Concepts · 11 min read

Compliance data for crypto: what regulators and investigators look at

A transaction monitoring system can only act on what its data shows it, and most data sources show only event logs. Logs miss the way funds actually move: native transfers, internal contract calls, and direct storage writes. This guide is about the data underneath compliance work. It walks through following funds through a transaction's full call tree with real trace and state-diff queries, then covers where the onchain record stops and attribution and the Travel Rule begin. It is not legal advice.

Updated 2026-06-04 · By the SQD team

1. Compliance data, and the gap that bites

Crypto compliance data has two halves. The first is the onchain record: the verifiable history of transfers, calls, and state changes that any node can independently confirm. The second is attribution: the off-chain intelligence that links an address to a real-world entity. A compliance program needs both, because the chain tells you what happened but not who was behind it.

The half that trips teams up is the first one, and the reason is narrower than it sounds. Most data sources expose event logs: the Transfer and similar events a contract emits. Logs are convenient, indexed, and easy to query, so monitoring tools are often built on them alone. The problem is that a determined actor moves value in ways that emit no log, and those are precisely the movements an investigation cares about. The rest of this guide is about closing that gap with data that does capture them: execution traces and state diffs.

2. Why event logs miss the money

An event log exists only when a contract's code calls LOG for it. Nothing forces a value movement to emit one. Three everyday cases leave no usable log:

  • Native value. ETH (or any chain's base asset) moving between accounts emits no ERC-20 Transfer log. A contract that forwards ETH to an address records nothing in the logs unless its code was written to emit an event for it.
  • Internal calls. When a contract calls another contract during a transaction, that is an internal call in the EVM call tree. A withdrawal routed through several contracts is a single top-level transaction; the intermediate hops are internal calls, and most emit no log of their own.
  • Direct storage writes. A contract can credit or debit an internal balance ledger by writing storage directly. No Transfer, sometimes no value-bearing call either, just a changed storage slot.

There is a concrete trap here that catches teams building their own pipeline. Filtering transactions by recipient looks like it should capture everything that touched an address:

{
  "type": "evm",
  "fromBlock": 19000000,
  "toBlock": 19000500,
  "transactions": [{ "to": ["0x...address under investigation..."] }],
  "fields": { "transaction": { "hash": true, "from": true, "value": true } }
}

This returns transactions sent directly to the address and misses every internal call that reached it through another contract. For monitoring, the hops you cannot see this way are often the ones that matter: a mixer, a nested router, a bridge adapter. The gap is not hypothetical. Take a heavily-used address, the WETH contract (0xC02a…756Cc2) for one, and count both ways over the same 20 blocks (19,000,000 to 19,000,020): it is the top-level to of 29 transactions, but it turns up in 2,370 internal-call traces. The part a transactions-only view cannot see outnumbers the part it can by more than eighty to one. The fix is to stop querying transactions and start querying traces.

3. Following the money with traces

A trace is one operation executed inside a transaction, including every internal call. One transaction can produce dozens of traces. They are the canonical answer to "where did the funds actually go," and SQD's Portal exposes them as a first-class data type. The query below, run against the Ethereum dataset, filters the trace stream for the address under investigation, catching it whether it received or sent value and at any depth, and joins each hop back to its transaction:

POST https://portal.sqd.dev/datasets/ethereum-mainnet/stream
Accept: application/x-ndjson

{
  "type": "evm",
  "fromBlock": 19000000,
  "toBlock": 19000500,
  "traces": [
    { "type": ["call"], "callTo": ["0x...address under investigation..."], "transaction": true },
    { "type": ["call"], "callFrom": ["0x...address under investigation..."], "transaction": true }
  ],
  "fields": {
    "transaction": { "hash": true, "from": true, "to": true, "value": true },
    "trace": {
      "type": true,
      "traceAddress": true,
      "callFrom": true,
      "callTo": true,
      "callValue": true,
      "callCallType": true,
      "error": true
    }
  }
}

The response is newline-delimited JSON, one line per block. Each hop that touches the address comes back tied to its transaction, and traceAddress places it in the call tree: [] is the top-level call, [0] the first internal call, [0,0] a call made inside that one, and so on. callValue is the value moved at each hop, and callCallType separates an ordinary call from a delegatecall or a read-only staticcall. The immediate counterparty of a matched hop, its callFrom or callTo, is the next address to follow. A withdrawal that fans out through several contracts looks like this under one transaction, with the filter surfacing the hop that lands on the address:

Transaction                           traceAddress []
|- call  [0]       router    -> contract A    value X
|  |- call  [0,0]  contract A -> contract B   value X
|  |  \- call [0,0,0]  contract B -> recipient  value X
\- call  [1]       router    -> fee sink      value Y
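The tree above can be rebuilt from traceAddress alone: a hop's depth is the length of its traceAddress, and sorting the lists puts hops in call order. A minimal Python sketch, assuming hops have already been flattened into plain dictionaries; the sample hops, including the "sender" name and the values, are hypothetical placeholders mirroring the diagram, not real data:

```python
from typing import Dict, List


def render_call_tree(hops: List[Dict]) -> List[str]:
    """Return one indented line per hop, ordered as the calls executed."""
    # Python compares lists lexicographically, which matches depth-first
    # call order: [] < [0] < [0, 0] < [0, 0, 0] < [1].
    ordered = sorted(hops, key=lambda hop: hop["traceAddress"])
    lines = []
    for hop in ordered:
        depth = len(hop["traceAddress"])  # [] is depth 0, the top-level call
        lines.append(
            f"{'  ' * depth}{hop['traceAddress']} "
            f"{hop['from']} -> {hop['to']} value {hop['value']}"
        )
    return lines


# Hypothetical hops, deliberately out of order; names stand in for addresses.
sample_hops = [
    {"traceAddress": [1], "from": "router", "to": "fee sink", "value": "Y"},
    {"traceAddress": [0, 0, 0], "from": "contract B", "to": "recipient", "value": "X"},
    {"traceAddress": [], "from": "sender", "to": "router", "value": "X+Y"},
    {"traceAddress": [0, 0], "from": "contract A", "to": "contract B", "value": "X"},
    {"traceAddress": [0], "from": "router", "to": "contract A", "value": "X"},
]

for line in render_call_tree(sample_hops):
    print(line)
```

Because the trace filter returns only the hops that touch the address, a full tree like this needs every trace of the transaction, not just the matched ones.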

Two details are worth knowing before you build on this. Traces carry a transactionIndex rather than a transaction hash, which is why the query above joins the transaction itself, so each hop ties back to a hash. And the response nests the action data: the field requested as callFrom arrives under action.from, callTo under action.to. With those in hand, the full path of funds through a transaction is reconstructable, not inferred.
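Both details can be handled in one small parsing step. The Python sketch below takes one NDJSON line (one block) and yields each hop with its transaction hash attached. The block shape is an assumption made for illustration: top-level "transactions" and "traces" arrays, with each transaction carrying a transactionIndex (which you may need to request in fields.transaction). Check it against a real response before relying on it.

```python
import json
from typing import Dict, Iterator


def hops_with_hashes(ndjson_line: str) -> Iterator[Dict]:
    """Yield flattened hops from one block line, each tied to a tx hash."""
    block = json.loads(ndjson_line)
    # Assumed shape: transactions and traces are sibling arrays in a block.
    tx_by_index = {
        tx["transactionIndex"]: tx for tx in block.get("transactions", [])
    }
    for trace in block.get("traces", []):
        action = trace.get("action", {})  # callFrom/callTo arrive nested here
        tx = tx_by_index.get(trace["transactionIndex"], {})
        yield {
            "txHash": tx.get("hash"),
            "traceAddress": trace.get("traceAddress", []),
            "callFrom": action.get("from"),
            "callTo": action.get("to"),
            "callValue": action.get("value"),
        }


# Hypothetical block line with placeholder hashes and addresses.
sample_line = json.dumps({
    "header": {"number": 19000000},
    "transactions": [{"transactionIndex": 7, "hash": "0xaaa"}],
    "traces": [{
        "transactionIndex": 7,
        "traceAddress": [0, 0],
        "type": "call",
        "action": {"from": "0x01", "to": "0x02", "value": "0x0"},
    }],
})

for hop in hops_with_hashes(sample_line):
    print(hop)
```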

4. State diffs: flows with no event

Traces cover internal calls, but the hardest movements to follow leave no value-bearing call at all: a contract that maintains an internal balance ledger and moves funds by writing a storage slot. A state diff records, for a transaction, which storage slots changed and their previous and new values. It is the lowest-level evidence of what a transaction did, below logs and below calls.

State diffs ride on a transaction query: a transactions filter is the only one that carries traces, logs, and state diffs together, since a trace filter cannot join them. Adding the stateDiffs join returns a matched transaction's complete footprint in one request. For a transaction sent straight to the address, that filter is its to:

{
  "type": "evm",
  "fromBlock": 19000000,
  "toBlock": 19000500,
  "transactions": [{
    "to": ["0x...address under investigation..."],
    "traces": true,
    "logs": true,
    "stateDiffs": true
  }],
  "fields": {
    "transaction": { "hash": true, "from": true, "to": true, "value": true, "input": true },
    "trace": { "type": true, "traceAddress": true, "callFrom": true, "callTo": true, "callValue": true },
    "log": { "address": true, "topics": true, "data": true },
    "stateDiff": { "address": true, "key": true, "kind": true, "prev": true, "next": true }
  }
}

The stateDiff selection returns each changed slot with its prev and next value and a kind flag marking it added, modified, or removed (the "*" in the row below marks a modification). One row that comes back, a USDC balance moving:

{
  "address": "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48",
  "key":  "0xd8a9b02cd36b54a798f62669d55d1e39cd3e2ef2f355be927a2823214d182bdc",
  "kind": "*",
  "prev": "0x00000000000000000000000000000000000000000000000000000007243adebb",
  "next": "0x00000000000000000000000000000000000000000000000000000007420843bb"
}

That is one storage slot on the USDC contract. Decoded with USDC's six decimals, prev and next are balances of 30,672.608955 and 31,172.608955 USDC: a 500 USDC transfer recorded as a storage write. Follow callValue through the traces and nothing moves here, because a token transfer carries no ETH, so the traces' value fields never show the amount; the diff does. State diffs sit below logs and calls, which is how they also catch an internal-ledger write that emits no log at all.
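The decoding step is plain integer arithmetic. A minimal Python sketch, assuming the slot holds an unsigned integer balance and using USDC's six decimals; the function name is illustrative:

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC amounts are stored in units of 10^-6


def slot_to_amount(word_hex: str, decimals: int = USDC_DECIMALS) -> Decimal:
    """Decode a 32-byte storage word as an unsigned token balance."""
    raw_units = int(word_hex, 16)  # int() accepts the 0x prefix in base 16
    return Decimal(raw_units) / (Decimal(10) ** decimals)


# prev and next copied from the state diff row above.
prev = slot_to_amount(
    "0x00000000000000000000000000000000000000000000000000000007243adebb"
)
nxt = slot_to_amount(
    "0x00000000000000000000000000000000000000000000000000000007420843bb"
)

print("prev:", prev)
print("next:", nxt)
print("delta:", nxt - prev)
```

Decimal keeps the six-decimal amounts exact, which matters when records are later reconciled against the chain.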

For an investigator, the result is the difference between "a transfer happened" and "here is everything this transaction touched": the top-level call, every internal hop, every emitted event, and every storage change, joined under one transaction hash. That completeness is also what makes a record defensible later, because it can be reproduced and checked against the chain rather than taken on trust.

One anchoring note for the internal-call cases. When the trace filter surfaced the address deep inside a transaction rather than as its top-level recipient, that transaction's to is another contract, so filtering by to will not select it. Instead, re-request the transaction's footprint using the from address and block number that the trace query already returned. That anchor holds however value reached the address.
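As a sketch of that re-request, the Python helper below builds a footprint query pinned to a single block and a sender. It reuses the query shape shown earlier in this section; the "from" filter key on transactions and the lowercasing of the address are assumptions here, and the sender address in the usage line is a placeholder.

```python
import json
from typing import Dict


def footprint_query(sender: str, block_number: int) -> Dict:
    """Build a one-block transaction query joined with traces, logs, diffs."""
    return {
        "type": "evm",
        "fromBlock": block_number,
        "toBlock": block_number,  # same block: the range is a single block
        "transactions": [{
            # Assumption: "from" is accepted as a filter the same way "to" is.
            "from": [sender.lower()],
            "traces": True,
            "logs": True,
            "stateDiffs": True,
        }],
        "fields": {
            "transaction": {"hash": True, "from": True, "to": True,
                            "value": True, "input": True},
            "trace": {"type": True, "traceAddress": True, "callFrom": True,
                      "callTo": True, "callValue": True},
            "log": {"address": True, "topics": True, "data": True},
            "stateDiff": {"address": True, "key": True, "kind": True,
                          "prev": True, "next": True},
        },
    }


# Placeholder sender and block, as returned by an earlier trace query.
print(json.dumps(footprint_query("0xABC", 19000123), indent=2))
```

If the sender made several transactions in that block, all of them come back; match on the hash from the trace query to pick the right one.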

5. Attribution, sanctions, and the Travel Rule

Tracing funds is only half the job. The onchain record shows what an address did; it does not say who controls it. Attribution, the work of linking addresses to exchanges, services, and named actors, is off-chain intelligence assembled from many sources, and it is the core product of specialist vendors such as Chainalysis, Elliptic, and TRM Labs. They sit on top of the data: the quality of their screening depends on having a complete and correct picture of the onchain activity in the first place.

Sanctions screening checks addresses against published lists, such as OFAC's Specially Designated Nationals (SDN) list. KYT (Know Your Transaction) is the ongoing version: scoring each transfer by its exposure, for instance how few hops separate it from a flagged address, which is exactly the kind of question the trace data above is built to answer.
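The hop-count question reduces to a shortest-path search over counterparty edges. The Python sketch below is an illustration only, not how any vendor scores exposure: it treats each (callFrom, callTo) pair as an undirected edge, ignores amounts and timing, and uses placeholder names in place of addresses.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set, Tuple


def hops_to_flagged(
    start: str,
    edges: Iterable[Tuple[str, str]],
    flagged: Set[str],
) -> Optional[int]:
    """Return the fewest hops from start to any flagged address, or None."""
    graph: Dict[str, Set[str]] = {}
    for sender, receiver in edges:
        # Simplifying assumption: exposure counts in both directions.
        graph.setdefault(sender, set()).add(receiver)
        graph.setdefault(receiver, set()).add(sender)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:  # breadth-first search finds the shortest path first
        address, distance = queue.popleft()
        if address in flagged:
            return distance
        for neighbour in graph.get(address, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Hypothetical edges taken from matched hops.
sample_edges = [("A", "B"), ("B", "C"), ("C", "mixer"), ("A", "D")]
print(hops_to_flagged("A", sample_edges, {"mixer"}))
```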

The Travel Rule, from FATF Recommendation 16, asks a VASP to obtain, hold, and transmit originator and beneficiary information so it travels with a transfer between providers. FATF suggests a de minimis threshold near USD or EUR 1,000, but thresholds and required fields vary by jurisdiction, and the identity data is exchanged off-chain through dedicated Travel Rule protocols. What the chain provides is the transfer: the asset, the amount, the addresses, and the timestamp. Compliance tooling lines the two up.

6. Compliance data with SQD

SQD provides the onchain data layer this work runs on, not the screening or attribution. The queries above are the actual interface: logs, traces, and state diffs for a transaction, decoded and typed, returned together in one request from the Portal. Because traces and state diffs sit alongside logs, the movements a log-only feed misses are in scope rather than invisible.

The same query shape runs against every dataset SQD indexes. Following funds onto another chain is a one-line change, swapping ethereum-mainnet for base-mainnet or arbitrum-one in the endpoint, across the networks listed at sqd.dev/chains, with full history from genesis. For a production workflow you would not hand-write these requests; the Squid and Pipes SDKs stream the same data into your own store, where it can be exported as structured, timestamped records for an AML or KYT system to consume.
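To make the one-line change concrete, here is a minimal Python sketch using only the standard library. The endpoint pattern and the Accept header come from the request shown earlier; the Content-Type header and the helper names are assumptions. stream_blocks is defined but not called here, since it needs network access.

```python
import json
import urllib.request
from typing import Dict, Iterator

PORTAL_BASE = "https://portal.sqd.dev/datasets"


def build_stream_request(dataset: str, query: Dict) -> urllib.request.Request:
    """Build the POST request; only the dataset name changes per chain."""
    return urllib.request.Request(
        url=f"{PORTAL_BASE}/{dataset}/stream",
        data=json.dumps(query).encode("utf-8"),
        headers={
            "Accept": "application/x-ndjson",
            "Content-Type": "application/json",  # assumption, not in the source
        },
        method="POST",
    )


def stream_blocks(dataset: str, query: Dict) -> Iterator[Dict]:
    """Send the query and yield one parsed block per NDJSON line."""
    with urllib.request.urlopen(build_stream_request(dataset, query)) as response:
        for raw_line in response:
            if raw_line.strip():  # skip blank lines between records
                yield json.loads(raw_line)


# The same query body pointed at two datasets.
query = {"type": "evm", "fromBlock": 19000000, "toBlock": 19000500}
for dataset in ("ethereum-mainnet", "base-mainnet"):
    print(build_stream_request(dataset, query).full_url)
```

Block numbers are chain-specific, so the range usually changes along with the dataset name.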

Attribution, risk scoring, and the regulatory judgment calls stay with the specialist tools and the compliance team on top of this data. For how the trace-level data is structured and exported for those tools, see the compliance solution page.

Frequently asked questions

Why are event logs not enough for transaction monitoring?
A log is only written when a contract's code chooses to emit one. Native value (ETH) moving emits no ERC-20 Transfer log; value moved by a contract mid-execution (an internal call) emits no log unless the contract is written to emit one; and a contract can change balances by writing storage directly, with no event at all. A monitoring system built only on logs is blind to all three, which are exactly the paths used to obscure a flow of funds. A complete picture needs execution traces and state diffs in addition to logs.
How do I get internal transactions (internal calls) for an address?
Internal calls live in execution traces, not in the transaction list or the logs. With SQD's Portal you query the traces data type (or join traces onto a transaction query) and filter by callTo or callFrom for the address. Each matched hop comes back tied to its transaction, with a traceAddress that places it in the call tree and a callValue for the amount moved at that hop. Filtering transactions by "to" alone misses these internal calls entirely.
What is a state diff and why does compliance need it?
A state diff is the record of which storage slots a transaction changed, with the previous and new value of each. It captures value movement that never emits an event, for example a contract that maintains an internal balance ledger and updates it by writing storage directly. For an investigator, state diffs are how you reconstruct flows that bypass Transfer events, and how you confirm what a transaction actually did rather than what it announced.
What is the Travel Rule?
The Travel Rule comes from Financial Action Task Force (FATF) Recommendation 16. It asks virtual asset service providers (VASPs) such as exchanges to obtain, hold, and transmit information about the originator and beneficiary of a transfer, so identifying information travels with the value. FATF suggests a de minimis threshold (around USD or EUR 1,000) below which lighter requirements may apply, but the exact thresholds and fields vary by jurisdiction. The transfer itself is onchain; the identity information is held by the VASPs and exchanged through dedicated Travel Rule protocols.
What is KYT (Know Your Transaction)?
KYT is the transaction-side counterpart to KYC. It screens transactions rather than verifying a customer once: scoring risk from counterparties and exposure (for example, how few hops separate a transfer from a sanctioned address) and flagging activity for review or a suspicious activity report. Established KYT and analytics vendors include Chainalysis, Elliptic, and TRM Labs, which pair onchain data with proprietary attribution.
Can onchain data alone identify who owns an address?
No. Onchain data shows what an address did, not who controls it. Attribution (linking an address to an exchange, a service, or a named actor) is off-chain intelligence built from many sources, and it is the core product of specialist analytics vendors. The onchain record is the verifiable substrate; attribution is a separate layer added on top.
