Concepts · 9 min read
What is onchain analytics?
Onchain analytics turns a blockchain's history into metrics and dashboards. The chart is the easy part; the work is the pipeline behind it. This guide walks that pipeline with real queries: decoding raw logs, computing a transfer-volume metric end to end, and reconstructing cross-chain stablecoin supply from mint and burn events. It closes on the choice every team faces, hosted SQL versus your own warehouse.
1. What it is, and the pipeline shape
Onchain analytics measures activity recorded on a blockchain: active users, trading volume, total value locked, fee revenue, token supply, holder distribution. A block explorer answers point questions about a single transaction; analytics answers aggregate questions across millions of records, which is why it runs against a database rather than a node.
Every dashboard is the end of a four-stage pipeline. Extract the relevant blocks, logs, and (sometimes) traces. Decode the raw hex into named events with typed fields. Model the decoded data into tables shaped around your questions, handling reorgs and normalizing across chains. Serve it from a warehouse that a BI tool queries with SQL. The sections below work through the two stages that actually take the time, decode and model, with real queries.
2. The decode step is the crux
A raw event log is a set of indexed topics plus a hex data blob. On its own it cannot be summed or grouped. Applying the contract ABI turns it into a Transfer with a named from, to, and value, which is the form analytics needs. The signature that identifies the event is topic0, the keccak256 hash of the event signature; for an ERC-20 Transfer(address,address,uint256) that is 0xddf252ad...b3ef.
Get this stage wrong and every number downstream is wrong: a mishandled proxy contract, an event signature that changed across an upgrade, or an ERC-721 transfer counted as an ERC-20 amount all surface as a bad figure on the dashboard. Decoding correctly across a contract's full history is most of the engineering, which is covered in what an EVM indexer handles.
3. A worked metric: transfer volume
Take a concrete question: daily transfer volume for the major stablecoins. The extract step is one query against SQD's Portal, filtering the Transfer event across the USDC, USDT, and DAI contracts (multiple addresses in one filter is OR logic):
POST https://portal.sqd.dev/datasets/ethereum-mainnet/stream
Accept: application/x-ndjson
{
"type": "evm",
"fromBlock": 18000000,
"toBlock": 18010000,
"logs": [{
"address": [
"0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48",
"0xdAC17F958D2ee523a2206206994597C13D831ec7",
"0x6B175474E89094C44Da98b954EedeAC495271d0F"
],
"topic0": ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]
}],
"fields": {
"block": { "number": true, "timestamp": true },
"log": { "address": true, "topics": true, "data": true }
}
}
That query streams raw logs as newline-delimited JSON: the transferred amount is a hex blob in data, the token is the log's address. Turning that into a number means decoding each log against the ERC-20 ABI and summing by token, which is what an SDK does for you. The same range through SQD's Pipes SDK decodes every Transfer and totals volume by token:
import { evmDecoder, evmPortalStream, commonAbis } from '@subsquid/pipes/evm'
// USDC, USDT, DAI on Ethereum, with their decimals
const TOKENS: Record<string, { symbol: string; decimals: number }> = {
'0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48': { symbol: 'USDC', decimals: 6 },
'0xdac17f958d2ee523a2206206994597c13d831ec7': { symbol: 'USDT', decimals: 6 },
'0x6b175474e89094c44da98b954eedeac495271d0f': { symbol: 'DAI', decimals: 18 },
}
const stream = evmPortalStream({
id: 'stablecoin-volume',
portal: 'https://portal.sqd.dev/datasets/ethereum-mainnet',
outputs: {
erc20: evmDecoder({
range: { from: 18_000_000, to: 18_001_000 },
contracts: Object.keys(TOKENS),
events: { transfers: commonAbis.erc20.events.Transfer },
}),
},
})
const volume: Record<string, number> = {}
for await (const { data } of stream) {
for (const t of data.erc20.transfers) {
const { symbol, decimals } = TOKENS[t.contract.toLowerCase()]
volume[symbol] = (volume[symbol] ?? 0) + Number(t.event.value) / 10 ** decimals
}
} Over the first 1,000 of those blocks (roughly 3.4 hours of mainnet), that totals 22,318 transfers worth about $542M: $318.8M USDC, $203.4M USDT, $19.5M DAI. Because the block window is fixed history, the same query returns the same totals on every run.
The loop above holds the totals in memory. To slice them by day and keep them queryable next to everything else, you persist the decoded rows into a warehouse table and let SQL group them. In ClickHouse, a daily rollup over a transfers table you populated looks like:
SELECT toDate(block_time) AS day,
token,
sum(amount) AS volume,
count() AS transfers
FROM transfers
GROUP BY day, token
ORDER BY day;
That is the whole shape of an onchain metric: a filtered extract, a decode, a table, a GROUP BY. The table schema is yours; the part that is easy to get wrong is everything to the left of the SQL.
Two interfaces, one dataset. The raw query and the SDK above reach the same Portal data at different levels. The Portal Stream API is the data layer itself: a streaming HTTP endpoint that serves the same filtered logs, transactions, and traces you would otherwise pull from RPC, with full history and in any language. Teams run it in production as the data source, not just for one-off questions (the trade-off against RPC is in RPC vs indexed data); you apply the contract ABI and aggregate the result yourself. The Pipes SDK sits on top of that same API in TypeScript, decoding logs into typed events and streaming them into a target store such as ClickHouse or Postgres, with progress tracking and reorg rollback handled for you.
| Approach | You handle | Reach for it when |
|---|---|---|
| Portal Stream API | Decoding, aggregation, and reorg bookkeeping | Any stack or language, or running in place of RPC for history and bulk reads |
| Pipes SDK | Little: decode, target writes, and reorg rollback are built in | A durable TypeScript indexer or pipeline |
The rule of thumb: reach for the API when you want the data in your own stack, the SDK when you want a TypeScript pipeline with decoding, storage, and reorgs built in.
4. A harder one: cross-chain stablecoin supply
Supply is more interesting than volume because it has to be reconstructed from issuance, and because a major stablecoin does not live on one chain. The mechanics onchain are simple: a mint is a Transfer from the zero address, a burn is a Transfer to the zero address, and circulating supply is the running total of mints minus burns. You capture both with two filters on the token's Transfer event, with the zero address in topic1 for mints and topic2 for burns:
{
"type": "evm",
"fromBlock": 18000000,
"toBlock": 18010000,
"logs": [
{ "address": ["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"],
"topic0": ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"],
"topic1": ["0x0000000000000000000000000000000000000000000000000000000000000000"] },
{ "address": ["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"],
"topic0": ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"],
"topic2": ["0x0000000000000000000000000000000000000000000000000000000000000000"] }
],
"fields": { "block": { "number": true, "timestamp": true },
"log": { "address": true, "topics": true, "data": true } }
} The cross-chain part is where it gets genuinely hard. The same brand of stablecoin is issued natively on several chains, and some issuers move supply between chains by burning on the source and minting on the destination (the pattern behind Circle's Cross-Chain Transfer Protocol for USDC). So a burn on Ethereum paired with a mint on Base is not a net change in total supply, it is the same dollars relocating. Measuring true circulating supply means running the mint-and-burn query on every chain a stablecoin is deployed on and reconciling the cross-chain burn-and-mint pairs, which is the general problem in multi-chain indexing. The query is the same shape on each chain; only the dataset name changes.
5. Hosted SQL vs your own warehouse
Hosted SQL services. Dune and Flipside maintain a curated dataset and let you write SQL on top. They are the fastest path to an ad-hoc query or a public dashboard, with no pipeline to run. The trade-offs are their chain coverage, the shape of their dataset, hosted-query rate limits, and the fact that the data lives on their platform. For a direct comparison, see SQD vs Dune.
Your own warehouse. The worked examples above are this path: you extract decoded data, model it into your own ClickHouse or BigQuery, and run BI on top. You control which chains, the schema, the freshness target, and where the data sits. The cost is owning the pipeline, although sourcing the decoded extract from a provider leaves you owning only the modelling. The recurring hard parts are the same across every metric: decoding, cross-chain normalization, reorg handling so unfinalized blocks do not double-count, and backfilling history from genesis.
6. Onchain analytics with SQD
SQD sits at the extract-and-decode end of this pipeline. The queries above are the real interface: filtered, decoded logs (and traces and state diffs where a metric needs them) streamed from the Portal, with the same query shape across every dataset listed at sqd.dev/chains and full history from genesis.
For a maintained pipeline rather than one-off requests, the Pipes SDK streams decoded data into ClickHouse, Postgres, or Parquet, and the Squid SDK persists to Postgres with a GraphQL API. Your BI tool then queries the resulting store with ordinary SQL. The full picture of how this feeds a warehouse is on the analytics solution page.
Frequently asked questions
What is onchain analytics?
What data do you need for onchain analytics?
How do you measure a stablecoin's supply onchain?
Should I use a hosted SQL service or build my own pipeline?
Can onchain analytics be real-time?
Related guides
Building onchain analytics?
See how teams pipe decoded data into the warehouse they already use on the analytics solution page, or compare the hosted-SQL route in SQD vs Dune.