What is an EVM indexer?
An EVM indexer turns the onchain data of an EVM chain (blocks, transactions, event logs, and optionally traces and state) into queryable database tables. The pipeline shape matches any other blockchain indexer, but the EVM's typed event logs and its reorg model shape what the decoder and the real-time path have to do. This guide is the EVM counterpart to "What is a Solana indexer?".
1. What is an EVM indexer?
An EVM indexer is a service that ingests data from an EVM-compatible chain and stores it in tables the application can query analytically. "EVM" means the Ethereum Virtual Machine: the execution environment Ethereum mainnet runs, shared by Base, Arbitrum, Polygon, BNB Chain, Optimism, Avalanche, and most L2s and alt-L1s. Because they share the EVM, they share a data model, and an indexer built for one indexes the others with little more than a new endpoint and a start block.
What separates an EVM indexer from a Solana indexer is not the role (read chain, decode, store, serve) but the work inside each step. EVM contracts emit typed event logs: structured records with a signature hash and ABI-encoded arguments. That makes the indexer's decoding job a matching problem: line up each log against the event definitions in a contract's ABI. Solana has no events in that sense, so its indexer keys off instruction discriminators instead.
Most of this article is about the EVM specifics: what the data model exposes, how decoders match logs to ABIs, how reorgs are handled, and why the same code carries across EVM chains.
2. The EVM data model: blocks, transactions, logs, traces
A block contains a header (number, hash, parent hash, timestamp, gas) and a list of transactions. Each transaction produces a receipt. That chain of containment, block to transaction to receipt to log, is what the indexer walks.
Transactions carry from, to, value (wei transferred), input (the calldata that selects and parameterises a contract call), nonce, and gas fields. The input field is opaque bytes until decoded against the target contract's ABI.
Receipts record what a transaction did: a success or failure status, the gas used, and the logs the transaction emitted. Logs are the workhorse of EVM indexing.
Event logs are the primary decoded surface. A contract emits a log with up to four topics and a data blob. The first topic identifies the event; the rest carry its indexed arguments; the data blob carries the non-indexed arguments, ABI-encoded. The eth_getLogs RPC method filters logs by contract address, topics, and block range, which is the request shape most indexers and RPC-based queries are built on.
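The request shape described above can be sketched as a JSON-RPC body. This is a minimal illustration, not a client: the helper name build_get_logs_request and the placeholder contract address are assumptions made for the example, and no network call is made.

```python
import json

# topic0 for Transfer(address,address,uint256)
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def build_get_logs_request(address, topics, from_block, to_block, request_id=1):
    """Build a JSON-RPC 2.0 body for eth_getLogs (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [{
            "address": address,             # contract that emitted the logs
            "topics": topics,               # position 0 filters on topic0
            "fromBlock": hex(from_block),   # block numbers are hex quantities
            "toBlock": hex(to_block),
        }],
    }


request = build_get_logs_request(
    address="0x" + "11" * 20,               # placeholder address, not a real token
    topics=[TRANSFER_TOPIC0],
    from_block=18_000_000,
    to_block=18_000_999,
)
print(json.dumps(request, indent=2))
```

The body would be sent as an HTTP POST to whatever RPC endpoint the indexer uses; the block range has to stay within that provider's per-call limits.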
Traces expose the internal message calls a transaction made: contract-to-contract CALLs, DELEGATECALLs, contract creations, and native ETH moved between contracts without emitting a log. Traces come from tracing methods (debug_traceTransaction, trace_block) that require a tracing-enabled archive node, so they are more expensive to produce. Indexers pull traces only when the use case needs the internal call tree or unlogged value movement.
State is the current value of every account balance, nonce, and contract storage slot. RPC reads it directly (eth_getBalance, eth_call, eth_getStorageAt). Indexers usually reconstruct the state they need from the event stream rather than reading storage slot by slot, because for well-behaved contracts the events already describe the changes that matter. Where a contract changes state without emitting an event, the indexer has to fall back to direct reads.
3. How decoders identify events: topics, signatures, ABIs
A log's first topic, topic0, is the keccak-256 hash of the event's canonical signature. For a token transfer that signature is the string Transfer(address,address,uint256), and its hash is the same on every EVM chain and in every contract that declares that event. The decoder filters by contract address and topic0, then knows which ABI definition the rest of the log conforms to.
Arguments split two ways. Indexed parameters become topics (topics[1] through topics[3]), which makes them filterable but caps a normal event at three of them. Non-indexed parameters are ABI-encoded into the data blob. One sharp edge: an indexed parameter of a dynamic type (a string, bytes, or array) is stored as its keccak hash, not its value, so the original cannot be recovered from the log alone.
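To make the topics-versus-data split concrete, here is a small sketch that decodes an ERC-20 Transfer log by hand. The log dictionary mimics the shape an RPC node returns, but the sample values and the function names are made up for illustration; a real indexer would normally lean on an ABI library instead.

```python
# topic0 for Transfer(address,address,uint256)
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_erc20_transfer(log: dict) -> dict:
    """Decode a log with two indexed addresses and one uint256 in the data blob."""
    topics = log["topics"]
    if topics[0].lower() != TRANSFER_TOPIC0 or len(topics) != 3:
        raise ValueError("not an ERC-20 Transfer log")
    return {
        "from": topic_to_address(topics[1]),   # indexed -> topics[1]
        "to": topic_to_address(topics[2]),     # indexed -> topics[2]
        "value": int(log["data"], 16),         # non-indexed -> data blob
    }


# Synthetic sample log (addresses and amount are invented for the example).
sample_log = {
    "topics": [
        TRANSFER_TOPIC0,
        "0x" + "00" * 12 + "aa" * 20,
        "0x" + "00" * 12 + "bb" * 20,
    ],
    "data": "0x" + format(1_500_000, "064x"),
}
decoded = decode_erc20_transfer(sample_log)
print(decoded)
```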
The canonical gotcha is the shared Transfer event. ERC-20 and ERC-721 both declare an event named Transfer whose canonical signature is identical, Transfer(address,address,uint256), so both produce the same topic0. They differ in what is indexed: ERC-20 indexes from and to (three topics in total), while ERC-721 also indexes the tokenId (four topics in total). A decoder tells a fungible transfer from an NFT transfer by counting the topics, not by the signature. An indexer that assumes one or the other miscounts supply and ownership.
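The topic-counting rule above fits in a few lines. This is a sketch that assumes standard-conforming contracts; classify_transfer is a hypothetical helper name.

```python
# topic0 for Transfer(address,address,uint256), shared by ERC-20 and ERC-721
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def classify_transfer(log: dict):
    """Return 'erc20', 'erc721', or None, based on the number of topics."""
    topics = log.get("topics", [])
    if not topics or topics[0].lower() != TRANSFER_TOPIC0:
        return None                # not a Transfer event at all
    if len(topics) == 3:
        return "erc20"             # topic0 + from + to; amount is in data
    if len(topics) == 4:
        return "erc721"            # topic0 + from + to + tokenId
    return None                    # unexpected layout; leave for manual handling


pad = "0x" + "00" * 32
print(classify_transfer({"topics": [TRANSFER_TOPIC0, pad, pad]}))       # erc20
print(classify_transfer({"topics": [TRANSFER_TOPIC0, pad, pad, pad]}))  # erc721
```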
Decoding a contract call rather than an event uses the same idea on the transaction's input. The first four bytes are the function selector, that is, the first four bytes of the keccak-256 hash of the function signature, and the rest is the ABI-encoded arguments. This matters when indexing from traces or raw transactions instead of logs.
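As an illustration of selector-based decoding, the sketch below splits the calldata of an ERC-20 transfer(address,uint256) call. The selector is hardcoded because Python's standard library has no keccak-256; the function name and the sample calldata are assumptions made for the example.

```python
# Selector for transfer(address,uint256)
TRANSFER_SELECTOR = "a9059cbb"


def decode_transfer_calldata(input_hex: str) -> dict:
    """Split calldata into the 4-byte selector and two 32-byte argument words."""
    raw = bytes.fromhex(input_hex[2:] if input_hex.startswith("0x") else input_hex)
    selector, args = raw[:4], raw[4:]
    if selector.hex() != TRANSFER_SELECTOR or len(args) != 64:
        raise ValueError("not a transfer(address,uint256) call")
    return {
        "to": "0x" + args[12:32].hex(),               # address: last 20 bytes of word 1
        "amount": int.from_bytes(args[32:64], "big"),  # uint256: word 2
    }


# Synthetic calldata: selector + padded recipient + padded amount.
calldata = "0x" + TRANSFER_SELECTOR + "00" * 12 + "cc" * 20 + format(250, "064x")
call = decode_transfer_calldata(calldata)
print(call)
```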
4. How an EVM indexer works
The stages mirror any indexer (see the indexer overview), with EVM-specific details at each step.
Extract. Three sources are practical. eth_getLogs against an RPC endpoint serves filtered logs directly, but providers cap the block range and result count per call, so backfilling a long history means many sequential calls and wall-clock time that scales linearly (covered in RPC vs indexed data). Downloading whole blocks and filtering locally trades bandwidth for fewer round-trips. A managed data lake, such as the SQD Portal, serves filtered ranges over HTTP with full history available without operating an archive node.
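The "many sequential calls" in an RPC backfill come from splitting the history into ranges the provider will accept. A minimal sketch of that splitting follows; the span of 1,000 blocks is an arbitrary example value, not any provider's actual limit.

```python
def block_chunks(start: int, end: int, max_span: int):
    """Yield inclusive (from_block, to_block) ranges that cover start..end."""
    if max_span < 1:
        raise ValueError("max_span must be at least 1")
    current = start
    while current <= end:
        upper = min(current + max_span - 1, end)
        yield current, upper
        current = upper + 1


ranges = list(block_chunks(100, 2599, 1000))
print(ranges)
```

Each range would become one eth_getLogs call, which is why backfill time grows linearly with the length of the history.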
Decode. For each log, the indexer matches topic0 against an ABI event registry and deserialises the topics and data into typed fields. The real complication is proxies: most production contracts are upgradeable proxies, so logs are emitted from the proxy address while the ABI that decodes them belongs to the implementation. The indexer maps proxy to implementation (the EIP-1967 storage slots hold the implementation address) and re-resolves when an upgrade changes it.
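The proxy lookup can be sketched as one storage read plus a small parse. The slot constant below is the EIP-1967 implementation slot; the helper names, the placeholder proxy address, and the sample storage word are illustrative assumptions, and the request is only built, not sent.

```python
# EIP-1967 implementation slot: keccak256("eip1967.proxy.implementation") - 1
EIP1967_IMPLEMENTATION_SLOT = (
    "0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc"
)
ZERO_ADDRESS = "0x" + "00" * 20


def storage_request(proxy_address: str, block: str = "latest", request_id: int = 1) -> dict:
    """Build an eth_getStorageAt body that reads the implementation slot."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getStorageAt",
        "params": [proxy_address, EIP1967_IMPLEMENTATION_SLOT, block],
    }


def implementation_from_word(word_hex: str):
    """Extract the address from the 32-byte word; None if the slot is empty."""
    digits = word_hex[2:] if word_hex.startswith("0x") else word_hex
    address = "0x" + digits.rjust(64, "0")[-40:]
    return None if address == ZERO_ADDRESS else address


body = storage_request("0x" + "22" * 20)           # placeholder proxy address
word = "0x" + "00" * 12 + "cd" * 20                # synthetic node response
print(implementation_from_word(word))
```

An empty slot (None here) suggests the contract is either not an EIP-1967 proxy or uses a different pattern, such as a diamond, which needs its own resolution logic.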
Transform. Common EVM transformations: scaling raw integer amounts by a token's decimals, resolving addresses to known labels, and joining events across contracts (a swap touches a pool, a router, and two token contracts in one logical action).
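Decimal scaling is the simplest of these transformations, and it is worth doing without floats. A minimal sketch, assuming the token's decimals value has already been looked up:

```python
from decimal import Decimal, localcontext


def scale_amount(raw: int, decimals: int) -> Decimal:
    """Convert a raw integer token amount to a human-scale Decimal, exactly."""
    with localcontext() as ctx:
        ctx.prec = 80                      # a uint256 has at most 78 decimal digits
        return Decimal(raw).scaleb(-decimals)


print(scale_amount(1_500_000, 6))          # a token with 6 decimals
print(scale_amount(10**18, 18))            # a token with 18 decimals
```

Storing the raw integer alongside the scaled value keeps the table lossless if the decimals lookup later turns out to be wrong.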
Store and serve. Postgres is the common default; ClickHouse and other columnar stores suit high-volume analytical workloads. The serving layer is GraphQL, SQL, or REST. SQD's Squid SDK ships GraphQL out of the box; its Pipes SDK streams decoded data into a store the team chooses.
5. Reorgs and finality
Near the chain tip, blocks are not yet settled and can be replaced. On Ethereum's proof-of-stake consensus, a block moves from proposed to safe and then finalized, where finalization takes roughly two epochs (about 13 minutes); a finalized block will not be rolled back short of an extreme consensus failure. Many chains and L2s instead offer probabilistic finality, where confidence grows with confirmations rather than a hard finalized flag.
An indexer has two ways to deal with this. It can index the unfinalized head for low latency and roll back affected rows when a block is reorganised, which means every write has to be reversible. Or it can commit only up to a finalized or safe depth, which avoids rollbacks but lags the tip by the finality window. Most production indexers run both: a fast unfinalized view for live UI and a settled finalized view for anything that must be correct.
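The rollback path can be sketched with an in-memory tracker that remembers which rows each unfinalized block produced. HeadTracker and its methods are hypothetical names; a real indexer would persist this state and re-fetch the replacement chain, which this sketch assumes is delivered block by block.

```python
class HeadTracker:
    """Tracks unfinalized blocks and the rows written for each one."""

    def __init__(self):
        self.blocks = []  # (number, block_hash, rows), oldest first

    def apply(self, number, block_hash, parent_hash, rows):
        """Append a block; return rows rolled back because of a reorg."""
        rolled_back = []
        # Pop tracked blocks until the new block's parent is the tip.
        while self.blocks and self.blocks[-1][1] != parent_hash:
            _, _, old_rows = self.blocks.pop()
            rolled_back.extend(old_rows)
        self.blocks.append((number, block_hash, list(rows)))
        return rolled_back

    def finalize(self, finalized_number):
        """Stop tracking blocks at or below the finalized height; return their rows."""
        settled = [b for b in self.blocks if b[0] <= finalized_number]
        self.blocks = [b for b in self.blocks if b[0] > finalized_number]
        return [row for _, _, rows in settled for row in rows]

    def live_rows(self):
        """Rows currently visible in the fast, unfinalized view."""
        return [row for _, _, rows in self.blocks for row in rows]


tracker = HeadTracker()
tracker.apply(1, "a", "genesis", ["r1"])
tracker.apply(2, "b", "a", ["r2"])
tracker.apply(3, "c", "b", ["r3"])
undone = tracker.apply(2, "b2", "a", ["r2x"])   # competing block at height 2
print(undone, tracker.live_rows())
```

The returned rows are what the store has to delete or reverse, which is why every write on the unfinalized path needs to be reversible.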
L2s add a wrinkle. A rollup's sequencer confirms transactions quickly, but they are only truly settled once posted to and finalized on the L1. Until then the ordering can in principle change. An indexer that treats a fast L2 soft-confirmation as final accepts a small reorg risk in exchange for latency, which is usually the right trade for display and the wrong one for accounting.
6. Decoding the common EVM standards
Most applications need decoded data for a handful of widely-used standards. The list below covers what production EVM indexers ship decoders for and the practical wrinkles each introduces.
ERC-20 (fungible tokens). The Transfer and Approval events cover most token-flow indexing. Wrinkles: some older tokens deviate from the standard (missing return values, non-standard decimals), rebasing tokens change balances without emitting a Transfer, and fee-on-transfer tokens move less than the event amount. A balance materialised purely from Transfer events will be wrong for those cases unless the indexer accounts for them.
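For reference, a balance table materialised purely from Transfer events looks roughly like the sketch below. It assumes the common convention that mints and burns appear as transfers from and to the zero address, and it is exactly the naive approach that goes wrong for rebasing and fee-on-transfer tokens.

```python
from collections import defaultdict

ZERO_ADDRESS = "0x" + "00" * 20


def apply_transfers(transfers):
    """Fold decoded Transfer events into per-address balances."""
    balances = defaultdict(int)
    for t in transfers:
        if t["from"] != ZERO_ADDRESS:      # transfer from zero address = mint
            balances[t["from"]] -= t["value"]
        if t["to"] != ZERO_ADDRESS:        # transfer to zero address = burn
            balances[t["to"]] += t["value"]
    return dict(balances)


# Synthetic events with shortened, made-up account labels.
events = [
    {"from": ZERO_ADDRESS, "to": "alice", "value": 100},
    {"from": "alice", "to": "bob", "value": 30},
    {"from": "bob", "to": ZERO_ADDRESS, "value": 5},
]
balances = apply_transfers(events)
print(balances)
```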
ERC-721 (NFTs). The Transfer event indexes tokenId as its third indexed argument, which puts it in a fourth topic (topics[3]); that extra topic is how an indexer tells it apart from an ERC-20 transfer. The tokenURI points at off-chain metadata (commonly IPFS or Arweave); production NFT pipelines cache the resolved metadata and re-fetch on changes. ApprovalForAll matters for marketplace indexing.
ERC-1155 (multi-token). TransferSingle and TransferBatch carry an operator alongside from and to, and a single batch event can move many token IDs at once, so the decoder has to fan a batch out into per-token rows.
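The fan-out step can be sketched as follows, starting from an already-decoded TransferBatch event. The dictionary keys and the row layout are assumptions made for the example, not a fixed schema.

```python
def fan_out_batch(event: dict) -> list:
    """Expand one decoded TransferBatch event into one row per token ID."""
    ids, values = event["ids"], event["values"]
    if len(ids) != len(values):
        raise ValueError("ids and values must have the same length")
    return [
        {
            "operator": event["operator"],
            "from": event["from"],
            "to": event["to"],
            "token_id": token_id,
            "amount": amount,
            "batch_index": i,          # keeps rows from one event distinguishable
        }
        for i, (token_id, amount) in enumerate(zip(ids, values))
    ]


# Synthetic decoded event with made-up labels.
batch = {"operator": "op", "from": "alice", "to": "bob", "ids": [7, 9], "values": [1, 50]}
rows = fan_out_batch(batch)
print(rows)
```

Keeping a batch_index (or the log index plus position) in the row key avoids collisions when the same token ID appears twice in one batch.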
Proxies and upgrades. Transparent proxies, UUPS proxies, and diamond (EIP-2535) proxies all separate the address that emits logs from the code that defines them. An indexer that hardcodes one ABI per address breaks the moment a contract upgrades; the durable approach versions the implementation ABI over block ranges.
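One way to version an ABI over block ranges is a sorted list of upgrade heights with a binary search on lookup. The sketch below is one possible shape, with AbiHistory as a hypothetical class and ABI names standing in for real ABI objects; it resolves at block granularity, so an upgrade in the middle of a block would need a finer key such as the log index.

```python
import bisect


class AbiHistory:
    """Maps a block number to the implementation ABI active at that block."""

    def __init__(self):
        self._starts = []   # sorted block numbers where an ABI became active
        self._abis = []     # ABI identifier active from the matching start block

    def record_upgrade(self, block_number: int, abi_name: str) -> None:
        i = bisect.bisect_left(self._starts, block_number)
        self._starts.insert(i, block_number)
        self._abis.insert(i, abi_name)

    def abi_at(self, block_number: int):
        """Return the ABI active at block_number, or None before the first one."""
        i = bisect.bisect_right(self._starts, block_number) - 1
        return self._abis[i] if i >= 0 else None


history = AbiHistory()
history.record_upgrade(500, "v2")
history.record_upgrade(100, "v1")
print(history.abi_at(499), history.abi_at(500))
```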
DeFi protocol events. AMMs (Uniswap V2 Swap/Sync/Mint/Burn, V3 with tick and liquidity fields), lending markets, and perps each emit protocol-specific events with their own decoders. Aggregators route through several venues in one transaction, so a volume index that only watches one protocol undercounts. The multi-chain indexing guide covers stitching these into one schema across chains.
7. How the patterns carry across EVM chains
The decoding model is identical across EVM chains. A topic0 hash, an ABI, and the ERC standards mean the same decoder runs unchanged on Ethereum, Base, Arbitrum, Polygon, Optimism, BNB Chain, and Avalanche. Many contracts are even deployed at the same address across chains via deterministic CREATE2 deployment, so one ABI mapping can apply everywhere. What changes between chains is operational, not structural.
Block time and throughput. Ethereum mainnet produces a block about every 12 seconds; many L2s are far faster, from roughly two seconds down to sub-second. Faster blocks mean more blocks and more events per unit time, so an indexer that keeps up on Ethereum may need more throughput on a high-traffic L2.
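The throughput difference is easy to put numbers on. This small calculation uses the block times mentioned above, plus 0.25 seconds as an arbitrary stand-in for "sub-second".

```python
SECONDS_PER_DAY = 86_400


def blocks_per_day(block_time_seconds: float) -> float:
    """Number of blocks produced per day at a steady block time."""
    return SECONDS_PER_DAY / block_time_seconds


for label, block_time in [("12 s", 12), ("2 s", 2), ("0.25 s", 0.25)]:
    print(label, blocks_per_day(block_time))
```

At 12 seconds that is 7,200 blocks a day; at 2 seconds it is 43,200, six times as many blocks to fetch, decode, and write.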
Finality and reorg semantics. Each chain settles differently, and L2 finality is tied to L1 settlement. The indexer's safe and finalized depths have to be configured per chain rather than assumed.
Chain-specific quirks. Arbitrum's ArbOS adds transaction types and a different gas model; OP Stack chains insert system transactions; Polygon PoS has its own checkpoint cadence. These surface at the block and transaction level, but at the decoded-event level the data is still ordinary EVM. The practical result: adding an EVM chain to an existing indexer is mostly a configuration change, a new endpoint and start block, rather than a rewrite.
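"Mostly a configuration change" can look like the sketch below: one config record per chain, with the decoding code shared. Every value here is a placeholder (chain names, URLs, start blocks, and finality depths are invented for illustration), and real depths have to come from each chain's own finality rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainConfig:
    name: str
    rpc_url: str
    start_block: int
    finality_depth: int   # blocks behind the head treated as settled


# Placeholder entries only; none of these values describe a real chain.
CHAINS = {
    "chain-a": ChainConfig("chain-a", "https://rpc.chain-a.example", 0, 64),
    "chain-b": ChainConfig("chain-b", "https://rpc.chain-b.example", 1_000_000, 200),
}


def highest_settled_block(head: int, config: ChainConfig) -> int:
    """Highest block the settled view may commit, given the current head."""
    return max(head - config.finality_depth, 0)


print(highest_settled_block(5_000, CHAINS["chain-a"]))
```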
8. EVM indexing tools in 2026
The tools below are commonly used for EVM indexing in 2026. For head-to-heads against SQD, see the comparison pages.
Author-your-own frameworks. The Graph's subgraphs (mappings in AssemblyScript) are the most widely deployed. SQD's Squid SDK and Pipes SDK (TypeScript) and Ponder (TypeScript) take a batch-processor approach, and Envio is another TypeScript option. These give the most control over schema and decoding.
Managed and hosted. Goldsky hosts The Graph-compatible subgraphs and streaming pipelines so a team avoids running indexing infrastructure. RPC providers such as Alchemy, QuickNode, and Infura serve the raw data underneath, and some layer their own data APIs on top.
SQD's EVM surface. The Portal data lake serves filtered, decoded ranges over HTTP across the EVM chains listed at sqd.dev/chains, with full history from genesis. The Squid SDK ships GraphQL by default; the Pipes SDK streams decoded EVM data into Postgres or ClickHouse with reorg handling built in.
9. EVM-specific vs multi-chain indexer
A single-EVM-chain product can pick almost any EVM tool and be well served. The decoding model is shared, so the main axes are hosting, schema flexibility, latency, and cost rather than coverage.
A product that spans several EVM chains, or EVM plus a non-EVM ecosystem like Solana or Bitcoin, benefits from a multi-chain indexer with a unified schema. One query layer covers every chain, and adding the next one is a configuration change. When the span crosses VMs, the value compounds: the same platform that handles the topic0-and-ABI EVM model also has to handle Solana's discriminator model, and a tool that does both spares the team a parallel stack.
The multi-chain indexing guide covers the architectural trade-offs, and the evaluation framework applies axis by axis: chain coverage, data shape, latency budget, hosting model, pricing, and lock-in.
Index any EVM chain with SQD
The Portal serves decoded EVM data across the chains listed at sqd.dev/chains. The Squid and Pipes SDKs decode logs, traces, and proxy upgrades in TypeScript.