AI agents and onchain data: what an agent needs to act on blockchain state
An AI agent that touches a blockchain has to read its state, reason over it, and sometimes act on it. None of that works if the data arrives as raw hex. This guide covers what an agent actually needs from onchain data, the Model Context Protocol (MCP) as a common pattern for exposing it, and how on-demand and pre-indexed pipelines trade off in agent workflows, with the real queries underneath.
1. What an agent needs from onchain data
Strip away the specific use case and an agent's requirements from onchain data come down to three things. It needs read access to current and historical state. It needs that data structured and decoded, in a shape a language model can reason over. And it needs to fetch it on demand, as a tool the model can invoke while it is working, not as a static document prepared in advance.
A trading agent checking a position, a research assistant answering a question, and a monitoring agent watching for an event all reduce to the same loop: decide what to look up, fetch decoded data, reason, repeat. The data layer's job is to make each fetch fast, correct, and in a format the model does not have to fight.
2. The format problem
Raw blockchain data is close to unusable for a language model. An event log is a set of indexed topics and a hex data blob. Calldata is a four-byte function selector followed by packed, ABI-encoded arguments. Amounts are big integers denominated in base units, with the decimals defined elsewhere. A model handed that hex has nothing to reason with.
What an agent can use is the decoded form: a Transfer with a named from, to, and an amount in human-readable units, returned as typed JSON. Producing that means applying the contract ABI, the same decoding step described in the guide on what an EVM indexer handles. For agents this is not a nicety; it is the line between data the model can act on and data it cannot.
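A minimal Python sketch of that decoding step for one ERC-20 Transfer log. It assumes the log arrives as a dict with `topics`, `data`, and an optional `transactionHash`, and that the token's decimals are known from elsewhere; `decode_transfer` and `to_units` are illustrative names, not part of any SDK.

```python
# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event signature hash
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def to_units(raw: int, decimals: int) -> str:
    """Convert base units to a decimal string using exact integer arithmetic."""
    if decimals == 0:
        return str(raw)
    whole, frac = divmod(raw, 10 ** decimals)
    return f"{whole}.{frac:0{decimals}d}"


def decode_transfer(log: dict, decimals: int) -> dict:
    """Turn a raw Transfer log (topics + hex data) into named, typed fields."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC0:
        raise ValueError("not an ERC-20 Transfer log")
    raw_amount = int(log["data"], 16)  # the uint256 value is the only non-indexed field
    return {
        "event": "Transfer",
        "from": topic_to_address(topics[1]),
        "to": topic_to_address(topics[2]),
        "amount": to_units(raw_amount, decimals),
        "transactionHash": log.get("transactionHash"),
    }
```

The amount is returned as a string so that large uint256 values keep full precision when serialized to JSON.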
3. The Model Context Protocol pattern
The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external tools and data. An MCP server publishes a set of tools and resources; an MCP-compatible host, such as a chat client, a coding assistant, or a custom agent runtime, discovers those tools and calls them on the model's behalf. The value of the standard is that it replaces one-off, per-source integrations with a single interface the model already knows how to use.
For onchain data, an MCP server turns "look up this address" or "count these transfers" into tools the agent calls directly, getting structured results back. SQD provides an open-source MCP server (subsquid-labs/portal-mcp-server) that wraps the SQD Portal API; the MCP server documentation covers connecting it to a host such as Claude, Cursor, or VS Code.
The point of the wrapper is that the agent works in questions, not request formats. A prompt like "how many USDC transfers above 1M moved on Base in the last day" becomes a tool call; the server resolves it to a Portal query against the right dataset and hands back rows the model can read, rather than the topics-and-data hex a node would return. The shape of that underlying query is in the next section.
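On the wire, an MCP tool call is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request and reads the text content out of a response. The tool name `count_transfers` and its arguments are hypothetical, chosen only to mirror the example prompt; the tools a real server exposes are discovered through `tools/list`.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


def read_tool_result(response_text: str) -> list[str]:
    """Return the text items of a tools/call result, raising on either error channel."""
    msg = json.loads(response_text)
    if "error" in msg:  # protocol-level JSON-RPC error
        raise RuntimeError(msg["error"].get("message", "JSON-RPC error"))
    result = msg["result"]
    if result.get("isError"):  # tool-level failure reported inside the result
        raise RuntimeError("tool reported an error")
    return [c["text"] for c in result.get("content", []) if c.get("type") == "text"]


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "count_transfers",
    {"network": "base-mainnet", "token": "USDC", "min_amount": 1_000_000},
)
```

In practice an MCP host or client library handles this framing; the point is that the model only ever supplies a tool name and a small arguments object.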
4. On-demand vs pre-indexed
Onchain data reaches an agent through one of two pipelines, and they trade off in opposite directions.
On-demand. The agent issues a query at runtime, for example through an MCP tool or a direct API call, and gets fresh data back. It takes the least setup, is always current, and suits open-ended questions where you cannot predict the access pattern. The costs are latency on every call and a dependency on the query service being available at the moment the agent needs it.
A direct on-demand fetch is an ordinary Portal request. The USDC-on-Base lookup from the previous section starts as a logs query filtered to the USDC contract's Transfer event, shown here over a sample 500-block range. The amount sits in the log's data field rather than in an indexed topic, so the 1M threshold is applied after decoding:
POST https://portal.sqd.dev/datasets/base-mainnet/stream
Accept: application/x-ndjson

{
  "type": "evm",
  "fromBlock": 20000000,
  "toBlock": 20000500,
  "logs": [{
    "address": ["0x833589fcd6edb6e08f4c7c32d4f71b54bda02913"],
    "topic0": ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"]
  }],
  "fields": { "log": { "topics": true, "data": true, "transactionHash": true } }
}
The agent (or the MCP tool in front of it) decodes each row into a typed Transfer with a named sender, recipient, and amount. That decoded JSON is what enters the model's context; the hex never does.
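A minimal Python sketch of that post-processing, assuming each line of the NDJSON response is a block object with a `header.number` field and a `logs` array carrying the requested fields. That response shape and the `large_transfers` helper are assumptions for illustration and should be checked against the Portal API reference. The sketch also applies the 1M threshold from the example prompt, which the logs query itself does not filter on.

```python
import json

USDC_DECIMALS = 6  # USDC uses 6 decimals


def large_transfers(ndjson: str, min_units: int, decimals: int = USDC_DECIMALS) -> list[dict]:
    """Decode Transfer logs from an NDJSON body and keep those above min_units tokens."""
    threshold = min_units * 10 ** decimals  # compare in base units to stay exact
    rows = []
    for line in ndjson.splitlines():
        if not line.strip():
            continue
        block = json.loads(line)
        number = block.get("header", {}).get("number")  # assumed field name
        for log in block.get("logs", []):
            raw = int(log["data"], 16)
            if raw <= threshold:
                continue
            rows.append({
                "block": number,
                "tx": log["transactionHash"],
                "from": "0x" + log["topics"][1][-40:],
                "to": "0x" + log["topics"][2][-40:],
                "amount": raw / 10 ** decimals,  # float, for display only
            })
    return rows
```

Calling `len(large_transfers(body, 1_000_000))` on a response body then answers the "how many" part of the question, and the rows themselves are what the model sees.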
Pre-indexed. The agent reads from a dataset you maintain ahead of time: a warehouse table, a cache, or a vector store. Reads are fast and deterministic, which suits known access patterns and high request volumes. The costs are the pipeline you run to keep the dataset current and the staleness window between updates. Building that pipeline is ordinary indexing work, covered in the blockchain data API guide.
Most production agent systems use both: pre-index the hot path the agent hits constantly, and query on demand for the long tail it hits rarely.
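One way to combine the two paths is a reader that serves pre-indexed entries while they are fresh enough and falls back to a live query otherwise. The sketch below is an in-memory illustration under that assumption; `HybridReader`, `fetch_live`, and `max_age_s` are hypothetical names, and a production system would back the store with a real database or cache.

```python
import time
from typing import Any, Callable


class HybridReader:
    """Serve pre-indexed values while fresh; fall back to an on-demand fetch."""

    def __init__(self, fetch_live: Callable[[str], Any], max_age_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_live = fetch_live  # the on-demand path, e.g. a Portal query
        self._max_age_s = max_age_s    # staleness window the agent tolerates
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}  # key -> (stored_at, value)

    def preload(self, key: str, value: Any) -> None:
        """Called by the indexing pipeline to refresh a hot-path entry."""
        self._store[key] = (self._clock(), value)

    def read(self, key: str) -> tuple[Any, str]:
        """Return (value, source) so the caller can see which path answered."""
        hit = self._store.get(key)
        if hit is not None and self._clock() - hit[0] <= self._max_age_s:
            return hit[1], "pre-indexed"
        value = self._fetch_live(key)
        self._store[key] = (self._clock(), value)  # cache the long-tail result too
        return value, "on-demand"
```

Returning the source alongside the value lets the agent, or its logs, record whether a decision was based on cached or live data.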
5. Where it gets hard
- Context limits. Onchain history is large and a context window is small. You cannot pour a contract's full history into a prompt; the agent needs targeted queries that return only what the current step requires.
- Freshness. State changes every block. An agent acting on stale data acts wrongly, so the freshness target of the data layer is part of the agent's correctness.
- Verifiability. An agent should be able to cite what it acted on, with block numbers and transaction hashes, so a decision can be audited after the fact rather than taken on trust.
- Multi-chain. An agent reasoning across chains needs one normalized interface rather than a separate integration per network, which is the problem described in the multi-chain indexing guide.
- Cost and determinism. Repeated live queries add latency and expense; caching and pre-indexing keep an agent loop affordable and repeatable.
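The context-limit and verifiability points can be handled together when tool output is packaged for the model. The sketch below caps the number of rows and attaches the block range and transaction hashes the rows came from. It assumes decoded rows are dicts with `block` and `tx` keys; `build_context` is a hypothetical helper, not part of any SQD SDK.

```python
import json


def build_context(rows: list[dict], max_rows: int) -> str:
    """Cap rows to fit a context budget and attach provenance for later audit."""
    kept = rows[:max_rows]
    blocks = [r["block"] for r in kept]
    return json.dumps({
        "rows": kept,
        "omitted": len(rows) - len(kept),  # tells the model the result was truncated
        "provenance": {
            "fromBlock": min(blocks) if blocks else None,
            "toBlock": max(blocks) if blocks else None,
            "transactions": sorted({r["tx"] for r in kept}),
        },
    })
```

Reporting the omitted count keeps the model from treating a truncated result as complete, and the provenance block gives it something concrete to cite.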
6. Onchain data for agents with SQD
SQD delivers decoded, typed onchain data as JSON across the networks it indexes, which is the format an agent needs. Agents reach it two ways. The open-source SQD MCP server exposes the SQD Portal to any MCP-compatible host for the on-demand path. For custom agent loops or pre-indexed datasets, the SQD Portal API and the SDKs deliver the same decoded data into your own store.
For the broader picture of building agent integrations, see the AI development documentation. For how teams wire this into a running agent, see the AI agents solution page.