Data access · 8 min read

RPC vs indexed data: which do you actually need?

The first decision when reading blockchain data: RPC or indexer. The right answer depends on what your application asks of the data. Current-state lookups want RPC. Historical scans, aggregations, and multi-contract queries want indexed data. This guide covers what each does well, where each breaks down, and the hybrid pattern most production applications end up at.

Updated 2026-05-15 · By the SQD team

1. What an RPC endpoint actually serves

A JSON-RPC endpoint is the request-response interface a blockchain node exposes. Applications call methods like eth_getBlockByNumber, eth_getBalance, eth_call, and eth_getLogs over HTTP or WebSocket and get back chain state. The node either runs locally or is provided by a service.
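As a rough illustration of that request-response shape, here is a minimal Python sketch of the JSON-RPC 2.0 envelope and an HTTP POST using only the standard library. The helper names build_request and rpc_call are hypothetical, and the endpoint URL is whatever your own node or provider exposes; nothing here is specific to any one provider.

```python
import json
import urllib.request


def build_request(method: str, params: list, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def rpc_call(url: str, method: str, params: list, timeout: float = 10.0):
    """POST one JSON-RPC request over HTTP and return its `result` field."""
    body = json.dumps(build_request(method, params)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    if "error" in reply:
        raise RuntimeError(f"RPC error: {reply['error']}")
    return reply["result"]


# The payload an application would send to read a balance at the latest block.
payload = build_request(
    "eth_getBalance", ["0x0000000000000000000000000000000000000000", "latest"]
)
```

Only the payload is built here; rpc_call is defined but not invoked, since it needs a live endpoint.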

RPC is good at what it was designed for: serving current chain state and recent activity to applications and other nodes. The data it returns is raw and undecoded: a transaction's input is a hex blob, an event log is a topics array plus a data blob, and account state is a hex-encoded balance plus a bytes blob for the code. The application is expected to decode anything it needs using the contract ABI.
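To make "topics array plus a data blob" concrete, the sketch below decodes one ERC-20 Transfer log by hand. It assumes the standard ERC-20 event layout (two indexed addresses, value in the data field); the sample log is fabricated for illustration and the names Transfer, topic_to_address, and decode_transfer are hypothetical. A real application would more likely use an ABI-aware library.

```python
from dataclasses import dataclass

# keccak256("Transfer(address,address,uint256)"): topic0 of an ERC-20 Transfer.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


@dataclass
class Transfer:
    token: str
    sender: str
    recipient: str
    value: int


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_transfer(log: dict) -> Transfer:
    """Decode a raw log dict, as returned by eth_getLogs, into a typed row."""
    topics = log["topics"]
    # ERC-20 Transfer has exactly three topics; other layouts are rejected.
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        raise ValueError("not an ERC-20 Transfer log")
    return Transfer(
        token=log["address"],
        sender=topic_to_address(topics[1]),
        recipient=topic_to_address(topics[2]),
        value=int(log["data"], 16),  # uint256 amount, in the token's base units
    )


# Fabricated sample log, shaped like an eth_getLogs result entry.
raw_log = {
    "address": "0x" + "aa" * 20,
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(1_000_000, "064x"),
}
decoded = decode_transfer(raw_log)
```

This per-event decoding is the work an indexer does once, up front, for every log it ingests.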

What RPC is not designed for: analytical queries across many blocks, joins across contracts, aggregations, or sub-second latency on historical data. The interface has no aggregation primitives; the storage layer behind it is optimized for point lookups, not scans.

2. What indexed data is

Indexed data is what comes out of a blockchain indexer: structured database tables produced by reading the chain, decoding events against ABIs, and storing the decoded result. The same logical action (a token transfer, a swap, an NFT mint) lives in a typed row with named columns, ready for SQL or GraphQL queries.

Indexed data answers analytical questions natively. "Total trading volume on Uniswap last week" is one SQL aggregate. "All transfers received by this address across all chains in the last year" is one indexed query. The work of fetching, decoding, and aggregating has already been done.
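A small sketch of what "one SQL aggregate" looks like, using an in-memory SQLite database as a stand-in for an indexer's output. The swaps table, its columns, and the sample rows are all assumptions made for illustration; a real indexer's schema will differ.

```python
import sqlite3

# In-memory stand-in for an indexer's database; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE swaps (block_time INTEGER, pool TEXT, amount_usd REAL)")
conn.executemany(
    "INSERT INTO swaps VALUES (?, ?, ?)",
    [
        (1_700_000_000, "pool_a", 1200.0),
        (1_700_086_400, "pool_b", 800.0),
        (1_690_000_000, "pool_a", 5000.0),  # older than the queried window
    ],
)


def volume_between(db: sqlite3.Connection, start_ts: int, end_ts: int) -> float:
    """Total USD volume for swaps with start_ts <= block_time < end_ts."""
    (total,) = db.execute(
        "SELECT COALESCE(SUM(amount_usd), 0) FROM swaps "
        "WHERE block_time >= ? AND block_time < ?",
        (start_ts, end_ts),
    ).fetchone()
    return total


WEEK_SECONDS = 7 * 24 * 3600
now = 1_700_100_000  # fixed "now" so the example is deterministic
weekly_volume = volume_between(conn, now - WEEK_SECONDS, now)
```

The query touches only decoded rows; no block scanning or ABI decoding happens at query time.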

What indexed data is not designed for: writing transactions, calling read functions on contracts (eth_call), or serving the absolute latest block within the same millisecond it lands. Indexers ingest from a chain source, so they always trail the chain tip by at least the ingestion lag between a block landing and its database write.

3. When RPC is enough

Three patterns where RPC alone is the right tool.

Writing transactions. Every wallet and every application that submits transactions needs RPC. eth_sendRawTransaction is an RPC method; there is no indexer equivalent because indexers serve the read side.

Current-state lookups. "What is this address's balance right now?" or "What does this contract's getReserves function return for this pool?" are single point queries against current state. RPC answers them directly. An indexer would either not have this data (if it doesn't materialise it) or would have it stale.
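As a sketch of what these point queries look like on the wire, the Python below decodes an eth_getBalance result and builds and decodes an eth_call to getReserves. It assumes a Uniswap V2-style pair whose getReserves() returns three 32-byte words (reserve0, reserve1, blockTimestampLast); the source does not name a specific pool contract, so treat that layout and the helper names as assumptions.

```python
from typing import Tuple

# First 4 bytes of keccak256("getReserves()"); assumes a Uniswap V2-style pair.
GET_RESERVES_SELECTOR = "0x0902f1ac"


def decode_quantity(hex_quantity: str) -> int:
    """Decode a JSON-RPC hex quantity such as an eth_getBalance result (wei)."""
    return int(hex_quantity, 16)


def get_reserves_call(pool_address: str) -> dict:
    """Build the eth_call payload for getReserves() at the latest block."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_call",
        "params": [{"to": pool_address, "data": GET_RESERVES_SELECTOR}, "latest"],
    }


def decode_reserves(return_data: str) -> Tuple[int, int, int]:
    """Split the ABI-encoded return value into its three 32-byte words."""
    if not return_data.startswith("0x"):
        raise ValueError("expected 0x-prefixed hex")
    raw = bytes.fromhex(return_data[2:])
    if len(raw) != 96:
        raise ValueError("expected exactly three 32-byte words")
    words = [int.from_bytes(raw[i:i + 32], "big") for i in range(0, 96, 32)]
    return words[0], words[1], words[2]


# Fabricated return blob for illustration.
sample_return = "0x" + "".join(format(n, "064x") for n in (5, 7, 1_700_000_000))
reserves = decode_reserves(sample_return)
one_ether = decode_quantity("0xde0b6b3a7640000")
```

Each lookup is one request and one small decode, which is why RPC alone handles this pattern comfortably.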

Short, recent windows. "Get all logs from this contract in the last 100 blocks" is well within eth_getLogs's comfort zone. Apps polling for recent activity, monitoring scripts, and alerting systems often live in this range and need nothing more than RPC.
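A minimal sketch of the filter object for such a window, assuming the latest block number has already been fetched (for example via eth_blockNumber). The function name recent_logs_filter is hypothetical; the fromBlock, toBlock, and address fields are the standard eth_getLogs filter keys.

```python
def recent_logs_filter(contract: str, latest_block: int, window: int = 100) -> dict:
    """Build an eth_getLogs filter covering the last `window` blocks, inclusive."""
    if window < 1:
        raise ValueError("window must be at least 1 block")
    # Inclusive range: the last 100 blocks ending at N start at N - 99.
    from_block = max(0, latest_block - window + 1)
    return {
        "address": contract,
        "fromBlock": hex(from_block),  # JSON-RPC quantities are hex strings
        "toBlock": hex(latest_block),
    }


log_filter = recent_logs_filter("0x" + "cc" * 20, latest_block=1000)
```

The resulting dict goes in as the single element of the params array of an eth_getLogs request.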

For these three patterns, adding an indexer is overkill: more infrastructure to operate, more latency than necessary, and no benefit over a well-chosen node provider.

4. When you need an indexer

The threshold question is whether the application needs decoded history, aggregations, or cross-contract queries. If yes, RPC alone will not get you there.

Full address history. Take a wallet that renders a user's complete transaction and token transfer history across every chain the user holds assets on. RPC has no efficient way to answer "every transfer this address has ever sent or received"; the app would have to scan every block of every chain. An indexer materialises this view once and serves it in milliseconds.

Aggregations. Anything with SUM, COUNT, AVG, or GROUP BY: DEX volume, lending TVL, supply changes, fee revenue. The aggregation either happens at query time (against indexed data) or in the application layer (against raw RPC data the app has to scan first). The first is fast; the second can take hours or days of wall-clock time on a long history.

Multi-contract or multi-chain queries. "All swaps across Uniswap, SushiSwap, and Curve in this time window" needs decoded data from multiple sources joined on a common shape. "All token transfers received by this address across Ethereum, Base, and Arbitrum" needs multi-chain decoded data. Both want an indexer.

Decoded, queryable history at scale. Analytics platforms, compliance tools, AI agents reading onchain state, anything that asks "what happened" rather than "what is now". An indexer is the only practical answer.

5. The hidden costs of each path

Both paths have costs that are easy to underestimate when starting out.

RPC: cost of stretching it past its design. Applications that start "RPC-only" often grow into territory RPC isn't built for. The symptoms appear gradually: page-load times grow as the app makes more RPC calls per page, costs from the RPC provider climb because each request is metered, and at some point an eth_getLogs call hits the provider's range cap and the team starts paginating. The further the app pushes RPC, the more it pays in latency and infrastructure to keep RPC working in a role it wasn't designed for.
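The pagination step can be sketched as below: split a long block range into chunks no larger than the provider's cap, then issue one eth_getLogs call per chunk. The cap of 2,000 blocks and the 20-million-block history used in the example are hypothetical numbers chosen for the arithmetic, not any provider's documented limits.

```python
from typing import Iterator, Tuple


def block_ranges(start: int, end: int, cap: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (from_block, to_block) chunks of at most `cap` blocks."""
    if cap < 1:
        raise ValueError("cap must be at least 1 block")
    lo = start
    while lo <= end:
        hi = min(lo + cap - 1, end)
        yield lo, hi
        lo = hi + 1


def calls_needed(start: int, end: int, cap: int) -> int:
    """Number of eth_getLogs calls required to cover [start, end]."""
    if end < start:
        return 0
    return -(-(end - start + 1) // cap)  # ceiling division


chunks = list(block_ranges(0, 2499, 1000))
# Hypothetical backfill: 20 million blocks at a 2,000-block cap.
backfill_calls = calls_needed(0, 19_999_999, 2000)
```

With those assumed numbers the backfill takes 10,000 sequential calls for a single contract filter, which is where the latency and metering costs start to add up.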

Indexer: cost of operating it. A self-hosted indexer has real costs: archive nodes (multiple terabytes of SSD per chain), the database, the indexer process, monitoring, on-call. A hosted indexer service has subscription cost that scales with usage. Either way, the team is paying for capabilities the application may not yet need. For an application that only does current-state lookups and writes, an indexer is over-provisioned.

The trick is to recognise which side of the threshold the application is on. A monitoring script that polls a contract every minute doesn't need an indexer. A DEX analytics dashboard does. A wallet that supports five chains needs both: RPC to write transactions and check current balances, indexed data to render history.

6. The hybrid pattern most production apps use

Most production applications use both. The split tends to fall along this line.

Use RPC for:

  • Submitting transactions (eth_sendRawTransaction).
  • Reading current contract state (eth_call, eth_getBalance).
  • Subscribing to pending or recent blocks (eth_subscribe).
  • Anything that must be absolutely fresh.

Use indexed data for:

  • History (transfers, swaps, mints, anything that happened in the past).
  • Aggregations (volumes, counts, balances over time).
  • Cross-contract or cross-chain queries.
  • Latency-sensitive paths through any of the above.

The architecture is straightforward: the application has an RPC client and an indexed-data client. Each page or feature picks the right one for its query. Where the lines blur (a wallet showing balances "as of now" but tied to history), the indexer is usually within seconds of the chain tip and good enough for the visible UI, with the RPC client reserved for actions the user takes (submitting a transaction).
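One way to picture "each page or feature picks the right one" is a small routing function that maps a query kind to a data source. This is a sketch of the split described in this section, not a prescribed design; the QueryKind categories, the Source enum, and pick_source are hypothetical names.

```python
from enum import Enum


class Source(Enum):
    RPC = "rpc"
    INDEXER = "indexer"


class QueryKind(Enum):
    SEND_TRANSACTION = "send_transaction"
    CURRENT_STATE = "current_state"
    HISTORY = "history"
    AGGREGATION = "aggregation"
    CROSS_CHAIN = "cross_chain"


# Writes and current-state reads always go to the RPC client.
RPC_ONLY = {QueryKind.SEND_TRANSACTION, QueryKind.CURRENT_STATE}


def pick_source(kind: QueryKind, must_be_fresh: bool = False) -> Source:
    """Choose which client a feature should use for a given query."""
    if kind in RPC_ONLY or must_be_fresh:
        return Source.RPC
    # History, aggregations, and cross-contract or cross-chain reads.
    return Source.INDEXER


# A wallet page: history from the indexer, the send button through RPC.
history_source = pick_source(QueryKind.HISTORY)
send_source = pick_source(QueryKind.SEND_TRANSACTION)
```

In a real application the two sources would be client objects rather than enum values; the point is that the decision is made per query, not per application.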

See the indexer evaluation framework for picking the indexer side of the split. The SQD vs Alchemy comparison covers one specific axis: when a team's RPC provider also offers indexed APIs, how those compare to a dedicated indexer.

Frequently asked questions

What is the difference between RPC and indexed data?
An RPC endpoint serves chain state through methods like eth_getBlockByNumber, eth_getBalance, and eth_getLogs. It returns raw, undecoded data for a single chain. Indexed data is the output of an indexer that has read the chain, decoded the raw primitives against contract ABIs, and stored the result in a database designed for analytical queries (filtering, aggregation, joins, full-history scans).
Do I need an indexer if I'm just reading the latest block?
No. For most current-state reads (latest balance, current price, latest block), an RPC node or RPC provider is the right choice. Indexers earn their cost when the application needs decoded history, aggregations, or cross-contract queries that JSON-RPC is not designed for.
What is eth_getLogs and why do providers limit it?
eth_getLogs is the JSON-RPC method that returns event logs matching a filter (contract address, topic, block range). RPC providers cap the block range and result count per call (typical caps are in the thousands of blocks or logs, varying by provider and tier) to prevent expensive scans from monopolising shared infrastructure. Backfilling a long history through eth_getLogs requires many sequential calls, and the wall-clock time scales linearly with the length of the range.
Can I use RPC for analytical queries?
In principle yes, in practice no. JSON-RPC has no aggregation primitives; computing "total trading volume on Uniswap last week" via RPC means pulling every Swap event for every Uniswap pool and aggregating client-side. For one-off scripts this is fine; for production analytics it is too slow, too expensive, and too brittle.
How much does running an archive node cost?
An Ethereum mainnet archive node typically requires multiple terabytes of SSD (more if it also traces), a multi-core CPU, and reliable bandwidth. Hardware and hosting costs vary, but a self-hosted archive node's monthly cost lands in the hundreds of dollars at minimum for a single chain, and multiplies with each additional chain.
What's the cheapest way to get historical blockchain data?
For one-off backfills, public archive nodes and snapshot dumps (Parquet exports, Erigon-snap downloads) are often free or low-cost. For ongoing access to decoded history, a managed data source or a decentralized data network is usually cheaper than running an archive node yourself once you factor in operational time. The break-even depends on usage volume; published rate cards from candidate providers are the input to that comparison.

Indexed data for the queries RPC can't answer

Portal serves decoded history across the chains listed at sqd.dev/chains. The Squid and Pipes SDKs let you shape the data however the application needs it.