Concepts · 11 min read

AI agents and onchain data: what an agent needs to act on blockchain state

An AI agent that touches a blockchain has to read its state, reason over it, and sometimes act on it. None of that works if the data arrives as raw hex. This guide covers what an agent actually needs from onchain data, the Model Context Protocol (MCP) as a common pattern for exposing it, and how on-demand and pre-indexed pipelines trade off in agent workflows, with the real queries underneath.

Updated 2026-06-04 · By the SQD team

1. What an agent needs from onchain data

Strip away the specific use case and an agent's requirements from onchain data come down to three things. It needs read access to current and historical state. It needs that data structured and decoded, in a shape a language model can reason over. And it needs to fetch it on demand, as a tool the model can invoke while it is working, not as a static document prepared in advance.

A trading agent checking a position, a research assistant answering a question, and a monitoring agent watching for an event all reduce to the same loop: decide what to look up, fetch decoded data, reason, repeat. The data layer's job is to make each fetch fast, correct, and in a format the model does not have to fight.
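The loop above can be sketched in a few lines. This is a minimal illustration, not any framework's API: `run_agent`, `decide_once`, and `fake_balance_tool` are hypothetical names, and `decide` stands in for the model call that picks the next lookup.

```python
from typing import Callable, Optional

# Hypothetical tool signature: takes arguments, returns decoded rows.
Tool = Callable[[dict], list]


def run_agent(
    decide: Callable[[list], Optional[tuple]],
    tools: dict,
    max_steps: int = 5,
) -> list:
    """Run decide -> fetch -> reason until the policy stops or steps run out."""
    observations = []
    for _ in range(max_steps):
        step = decide(observations)      # stand-in for the model choosing a lookup
        if step is None:                 # the policy has enough to answer
            break
        name, args = step
        rows = tools[name](args)         # fetch decoded data through a tool
        observations.append({"tool": name, "args": args, "rows": rows})
    return observations


def fake_balance_tool(args: dict) -> list:
    # Placeholder data layer; a real tool would query an API here.
    return [{"address": args["address"], "balance": "12.5", "block": 20000000}]


def decide_once(observations: list) -> Optional[tuple]:
    # Ask for one lookup, then stop.
    return None if observations else ("get_balance", {"address": "0xabc"})


history = run_agent(decide_once, {"get_balance": fake_balance_tool})
```

The step cap is a design choice worth keeping in any real loop: it bounds latency and cost when the model keeps asking for more data.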

2. The format problem

Raw blockchain data is close to unusable for a language model. An event log is a set of indexed topics and a hex data blob. Calldata is a four-byte function selector followed by packed, ABI-encoded arguments. Amounts are big integers denominated in base units, with the decimals defined elsewhere. A model handed that hex has nothing to reason with.

What an agent can use is the decoded form: a Transfer with a named from, to, and an amount in human-readable units, returned as typed JSON. Producing that means applying the contract ABI, the same decoding step described in what an EVM indexer handles. For agents this is not a nicety; it is the line between data the model can act on and data it cannot.
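As a rough sketch of that decoding step for one well-known case, the ERC-20 Transfer event, the function below turns a raw log into typed JSON using only the standard library. The function names and output field names are illustrative assumptions; a real pipeline would decode from the contract ABI rather than hard-code one event.

```python
# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event signature
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:].lower()


def to_units(raw: int, decimals: int) -> str:
    """Convert base units to a decimal string using exact integer arithmetic."""
    if decimals == 0:
        return str(raw)
    whole, frac = divmod(raw, 10 ** decimals)
    return f"{whole}.{frac:0{decimals}d}"


def decode_transfer(log: dict, decimals: int) -> dict:
    """Decode a raw ERC-20 Transfer log into named, typed fields."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC0:
        raise ValueError("not an ERC-20 Transfer log")
    raw = int(log["data"], 16)           # the non-indexed uint256 value
    return {
        "event": "Transfer",
        "from": topic_to_address(topics[1]),
        "to": topic_to_address(topics[2]),
        "amount": to_units(raw, decimals),
        "amountRaw": str(raw),
    }


# Synthetic example log: 1,500,000 base units of a 6-decimal token such as USDC.
sample_log = {
    "topics": [
        TRANSFER_TOPIC0,
        "0x" + "0" * 24 + "11" * 20,
        "0x" + "0" * 24 + "22" * 20,
    ],
    "data": "0x" + format(1_500_000, "064x"),
}
decoded = decode_transfer(sample_log, decimals=6)
```

Keeping `amountRaw` alongside the human-readable amount lets the agent show a readable value while any downstream arithmetic stays exact.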

3. The Model Context Protocol pattern

The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external tools and data. An MCP server publishes a set of tools and resources; an MCP-compatible host, such as a chat client, a coding assistant, or a custom agent runtime, discovers those tools and calls them on the model's behalf. The value of the standard is that it replaces one-off, per-source integrations with a single interface the model already knows how to use.

For onchain data, an MCP server turns "look up this address" or "count these transfers" into tools the agent calls directly, getting structured results back. SQD provides an open-source MCP server (subsquid-labs/portal-mcp-server) that wraps the SQD Portal API; the MCP server documentation covers connecting it to a host such as Claude, Cursor, or VS Code.

The point of the wrapper is that the agent works in questions, not request formats. A prompt like "how many USDC transfers above 1M moved on Base in the last day" becomes a tool call; the server resolves it to a Portal query against the right dataset and hands back rows the model can read, rather than the topics-and-data hex a node would return. The shape of that underlying query is in the next section.
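On the wire, an MCP tool call is a JSON-RPC 2.0 message with the method `tools/call`. The sketch below builds such a request and reads the text parts of a result. The tool name `count_transfers` and its arguments are hypothetical, chosen for illustration; they are not taken from the SQD MCP server, whose actual tool list is in its documentation.

```python
import json


def build_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


def read_tool_result(raw: str) -> list:
    """Return the text parts of a tools/call response, raising on errors."""
    message = json.loads(raw)
    if "error" in message:               # protocol-level JSON-RPC error
        raise RuntimeError(message["error"].get("message", "tool call failed"))
    result = message["result"]
    if result.get("isError"):            # the tool itself reported a failure
        raise RuntimeError("tool reported an error")
    return [
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    ]


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "count_transfers",
    {"chain": "base", "token": "USDC", "minAmount": "1000000"},
)
```

In practice the host builds and sends these messages; the sketch only shows the shape of what moves between the host and the server.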

4. On-demand vs pre-indexed

Onchain data reaches an agent through one of two pipelines, and they trade off in opposite directions.

On-demand. The agent issues a query at runtime, for example through an MCP tool or a direct API call, and gets fresh data back. This path takes the least setup, is always current, and is well suited to open-ended questions where you cannot predict the access pattern. The costs are latency on every call and a dependency on the query service being available at the moment the agent decides.

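A direct on-demand fetch is an ordinary Portal request. The USDC-on-Base lookup from the previous section starts as a logs query filtered to the Transfer event, shown here over a fixed block range. The amount threshold is applied after decoding, because the transfer value sits in the non-indexed data field and cannot be filtered by topic: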

POST https://portal.sqd.dev/datasets/base-mainnet/stream
Accept: application/x-ndjson

{
  "type": "evm",
  "fromBlock": 20000000,
  "toBlock": 20000500,
  "logs": [{
    "address": ["0x833589fcd6edb6e08f4c7c32d4f71b54bda02913"],
    "topic0": ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]
  }],
  "fields": { "log": { "topics": true, "data": true, "transactionHash": true } }
}

The agent (or the MCP tool in front of it) decodes each row into a typed Transfer with a named sender, recipient, and amount. That decoded JSON is what enters the model's context; the hex never does.
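A minimal sketch of issuing that request from Python with the standard library follows. Two things here are assumptions rather than facts from this guide: the `Content-Type: application/json` header, and the response shape, where each NDJSON line is taken to be one block object with a `header` and a `logs` array. Check both against the Portal API reference before relying on them.

```python
import json
import urllib.request

PORTAL_URL = "https://portal.sqd.dev/datasets/base-mainnet/stream"


def build_request(from_block: int, to_block: int, address: str, topic0: str) -> urllib.request.Request:
    """Build the POST request for a logs query over a fixed block range."""
    body = {
        "type": "evm",
        "fromBlock": from_block,
        "toBlock": to_block,
        "logs": [{"address": [address], "topic0": [topic0]}],
        "fields": {"log": {"topics": True, "data": True, "transactionHash": True}},
    }
    return urllib.request.Request(
        PORTAL_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",   # assumption, not shown in the guide
            "Accept": "application/x-ndjson",
        },
        method="POST",
    )


def iter_logs(ndjson_lines):
    """Yield one flat row per log. Assumes each line is a block with a 'logs' list."""
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue
        block = json.loads(line)
        number = block.get("header", {}).get("number")
        for log in block.get("logs", []):
            yield {"blockNumber": number, **log}


def fetch_logs(request: urllib.request.Request) -> list:
    """Send the request and collect rows; performs a live network call."""
    with urllib.request.urlopen(request, timeout=30) as response:
        lines = (raw.decode("utf-8") for raw in response)
        return list(iter_logs(lines))
```

Separating `iter_logs` from the network call keeps the parsing testable offline and lets the same parser serve a cached or replayed response.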

Pre-indexed. The agent reads from a dataset you maintain ahead of time: a warehouse table, a cache, or a vector store. Reads are fast and deterministic, which suits known access patterns and high request volumes. The costs are the pipeline you run to keep the dataset current and the staleness window between updates. Building that pipeline is ordinary indexing work, covered across the blockchain data API guide.

Most production agent systems use both: pre-index the hot path the agent hits constantly, and query on demand for the long tail it hits rarely.
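One way to combine the two paths is a reader that serves pre-indexed entries while they are fresh enough and falls back to an on-demand fetch otherwise. The class below is a hypothetical sketch, with an in-memory dictionary standing in for the warehouse table or cache and `fetch_on_demand` standing in for a live query.

```python
import time
from typing import Any, Callable


class HybridReader:
    """Serve fresh pre-indexed entries; fall back to an on-demand fetch."""

    def __init__(
        self,
        fetch_on_demand: Callable[[str], Any],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_on_demand
        self._max_age = max_age_seconds
        self._clock = clock
        self._store = {}                 # key -> (stored_at, value)

    def preload(self, key: str, value: Any) -> None:
        """Called by the indexing pipeline to refresh a hot-path entry."""
        self._store[key] = (self._clock(), value)

    def get(self, key: str) -> tuple:
        """Return (value, source) where source names the path that served it."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1], "pre-indexed"
        value = self._fetch(key)         # long tail or stale entry: query live
        self._store[key] = (now, value)
        return value, "on-demand"
```

Returning the source with the value lets the agent, or its logs, distinguish an answer read from the staleness window from one fetched live.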

5. Where it gets hard

Context limits. Onchain history is large and a context window is small. You cannot pour a contract's full history into a prompt; the agent needs targeted queries that return only what the current step requires.
Freshness. State changes every block. An agent acting on stale data acts wrongly, so the freshness target of the data layer is part of the agent's correctness.
Verifiability. An agent should be able to cite what it acted on, with block numbers and transaction hashes, so a decision can be audited after the fact rather than taken on trust.
Multi-chain. An agent reasoning across chains needs one normalized interface rather than a separate integration per network. This is the problem described in multi-chain indexing.
Cost and determinism. Repeated live queries add latency and expense; caching and pre-indexing keep an agent loop affordable and repeatable.
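Three of these constraints (context limits, freshness, and verifiability) can be enforced where a tool result is packaged for the model. The sketch below is illustrative only: the function names, the `blockNumber` and `transactionHash` row fields, and the lag threshold are assumptions, not part of any SQD API.

```python
def is_fresh(head_block: int, data_block: int, max_lag_blocks: int) -> bool:
    """Treat data as fresh if it trails the chain head by at most max_lag_blocks."""
    return head_block - data_block <= max_lag_blocks


def summarize_for_context(rows: list, max_rows: int) -> dict:
    """Cap rows to a budget and attach the provenance the agent can cite."""
    shown = rows[:max_rows]
    blocks = [row["blockNumber"] for row in rows]
    return {
        "rows": shown,
        "omitted": len(rows) - len(shown),   # tell the model what it is not seeing
        "blockRange": [min(blocks), max(blocks)] if blocks else None,
        "sources": [
            {"block": row["blockNumber"], "tx": row["transactionHash"]}
            for row in shown
        ],
    }


# Synthetic rows for illustration.
example_rows = [
    {"blockNumber": 20000003, "transactionHash": "0xc", "amount": "5.000000"},
    {"blockNumber": 20000001, "transactionHash": "0xa", "amount": "1.000000"},
    {"blockNumber": 20000002, "transactionHash": "0xb", "amount": "2.000000"},
]
packaged = summarize_for_context(example_rows, max_rows=2)
```

Reporting the omitted count and the full block range keeps the model from treating a truncated result as the whole history.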

6. Onchain data for agents with SQD

SQD delivers decoded, typed onchain data as JSON across the networks it indexes, which is the format an agent needs. Agents reach it two ways. The open-source SQD MCP server exposes the SQD Portal to any MCP-compatible host for the on-demand path. For custom agent loops or pre-indexed datasets, the SQD Portal API and the SDKs deliver the same decoded data into your own store.

For the broader picture of building agent integrations, see the AI development documentation, and the AI agents solution page for how teams wire this into a running agent.

Frequently asked questions

What does an AI agent need from blockchain data?

Three things. Read access to current and historical state; that data in a structured, decoded form a language model can reason over rather than raw hex; and a way to fetch it at the moment the agent is reasoning, exposed as a tool the agent can call. Whether the agent is answering a question, monitoring a position, or deciding on an action, those requirements stay the same.

What is the Model Context Protocol (MCP)?

MCP is an open standard for connecting AI assistants to external tools and data sources. An MCP server exposes a set of tools and resources, and an MCP-compatible host (a chat client, a coding assistant, or a custom agent runtime) can call them on the model's behalf. For onchain data, an MCP server lets an agent ask blockchain questions and receive structured answers without a developer hand-wiring a separate integration for each data source.

Can an LLM read raw blockchain data?

Not usefully. A raw log is indexed topics plus a hex data blob, calldata is a function selector plus packed arguments, and amounts are big integers in base units. None of that is meaningful to a language model without decoding. The agent needs the decoded form: a named event with typed fields and human-readable values, as JSON. Decoding with the contract ABI is a precondition for agents, not an optimization.

On-demand or pre-indexed data: which does an agent need?

On-demand queries fetch fresh data at runtime, which is simplest to set up, always current, and good for open-ended questions, at the cost of per-call latency and a dependency on the query service at decision time. Pre-indexed pipelines read from a dataset you maintain ahead of time, which is fast and deterministic for known access patterns and high request volumes, at the cost of running the pipeline and a staleness window. Many agent systems do both: pre-index the hot path, query on demand for the long tail.

Does SQD have an MCP server?

Yes. SQD provides an open-source MCP server (subsquid-labs/portal-mcp-server) that wraps the SQD Portal API, so an MCP-compatible host can query onchain data across the networks SQD supports and get typed results back. Setup details are in the SQD MCP server documentation at docs.sqd.dev/en/ai/mcp-server.

