
Audit trail design for AI agents

What every AI agent action should log, how the log should be structured, and how to make the log itself credible to an adversarial auditor. The reference template.

The four anchors a regulator will read

Strip away vendor-specific noise and every regulator's question is the same: identity, authority, action, audit. Each entry in your audit trail should answer all four.

The minimum entry

{
  "ts": "2026-05-03T14:22:01.412Z",
  "human": {
    "did": "did:manav:0x9a3f7e1d...",
    "verified_at": "2026-05-03T08:42:03Z",
    "method": "biometric+device",
    "competence_certs": ["did:manav:cert:trader-license-FINRA-3110"]
  },
  "delegation": {
    "id": "delegation-7d2e",
    "scope": ["equities:large-cap:<5M-notional"],
    "ttl_remaining": "3h 47m",
    "magnitude_remaining": "USD 4.2M"
  },
  "agent": {
    "id": "agent:claude:opus-4-7:run-7d2e",
    "framework": "mcp"
  },
  "action": {
    "tool": "exchange.submit_order",
    "params_hash": "sha256:8f1e...",
    "outcome": "filled",
    "result_hash": "sha256:91a2..."
  },
  "supervision": {
    "second_human": "did:manav:0xc104...",
    "verified_at": "2026-05-03T14:22:00Z",
    "role": "supervisor"
  },
  "signature": "ed25519:..."
}

Seven top-level fields, each independently verifiable, each serving one of the four anchors. The shape can be JSON, Protobuf, or OpenTelemetry; the encoding does not matter. The fields do.
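A completeness check on each entry can be sketched in a few lines. The mapping of fields to anchors below is an assumption made for illustration (the template names the anchors but does not assign fields to them), and `missing_anchors` is a hypothetical helper, not part of any protocol.

```python
# Assumed mapping of top-level fields to the four anchors (illustrative only).
ANCHOR_FIELDS = {
    "identity": ["human"],
    "authority": ["delegation", "supervision"],
    "action": ["agent", "action"],
    "audit": ["ts", "signature"],
}


def missing_anchors(entry: dict) -> dict[str, list[str]]:
    """Return {anchor: [missing or empty fields]} for each incomplete anchor."""
    gaps = {}
    for anchor, fields in ANCHOR_FIELDS.items():
        absent = [f for f in fields if not entry.get(f)]
        if absent:
            gaps[anchor] = absent
    return gaps


# Placeholder values, not real identifiers.
entry = {
    "ts": "2026-05-03T14:22:01.412Z",
    "human": {"did": "did:example:alice", "method": "biometric+device"},
    "delegation": {"id": "delegation-1", "scope": ["equities:large-cap"]},
    "agent": {"id": "agent:example:run-1", "framework": "mcp"},
    "action": {"tool": "exchange.submit_order", "outcome": "filled"},
    "signature": "ed25519:placeholder",
}

print(missing_anchors(entry))  # {'authority': ['supervision']}
```

Rejecting entries at write time when this returns a non-empty result is one way to keep the schema canonical across teams.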

Tamper-evidence

An audit log that the operator can edit is not an audit log. Three patterns make logs credible:

Hash chaining. Each entry references the hash of the previous entry. Tampering with one entry invalidates everything after it.
Merkle tree commitment. Periodically commit the root hash to an external system (a public chain, a notary service, a partner organization) so the operator cannot quietly rewrite history.
Independent verifiability. The auditor must be able to verify the chain without trusting your infrastructure. Public Merkle root + signed log = verifiable from any copy.
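The first two patterns can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the record layout (`body`, `prev`, `hash`), the all-zero genesis value, and the duplicate-last-node Merkle scheme are choices made here for brevity, not a standard-conformant construction. A production log would more likely follow a published design such as the Merkle tree in RFC 6962.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(body: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, no whitespace) keeps the hash reproducible.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append(log: list, body: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"body": body, "prev": prev, "hash": entry_hash(body, prev)})


def first_broken_index(log: list) -> Optional[int]:
    """Return the index of the first entry that fails verification, else None."""
    prev = GENESIS
    for i, rec in enumerate(log):
        if rec["prev"] != prev or rec["hash"] != entry_hash(rec["body"], prev):
            return i
        prev = rec["hash"]
    return None


def merkle_root(leaf_hashes: list) -> str:
    """Pairwise SHA-256 up to one root; an odd node is paired with itself."""
    if not leaf_hashes:
        raise ValueError("no leaves")
    level = [bytes.fromhex(h) for h in leaf_hashes]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


log = []
for n in range(4):
    append(log, {"seq": n, "tool": "exchange.submit_order"})

root_before = merkle_root([r["hash"] for r in log])  # value to commit externally
log[1]["body"]["tool"] = "exchange.cancel_order"     # simulated tampering
print(first_broken_index(log))  # 1
```

If the tamperer also recomputes every later hash so the chain verifies again, the recomputed Merkle root no longer matches the externally committed `root_before`. That is why the chain and the external commitment are needed together.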

What "completeness" means

Auditors will probe for gaps. The standard tactic: ask for the log of a specific time window, look for monotonically increasing sequence numbers, and check that any gap is explained. Three controls keep completeness defensible:

Sequence numbers per agent and per relying party.
Heartbeat entries during quiet periods.
Failure-to-log alerts (paged when the log writer cannot record an event).
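The gap check an auditor runs is simple enough to run yourself first. The sketch below assumes each entry carries hypothetical `agent`, `relying_party`, and `seq` keys, and that heartbeat entries consume sequence numbers like any other entry.

```python
from collections import defaultdict


def find_gaps(seqs: list) -> list:
    """Return (first_missing, last_missing) ranges in a sorted list of sequence numbers."""
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur <= prev:
            # A duplicate is as suspicious as a gap.
            raise ValueError(f"sequence not strictly increasing at {prev} -> {cur}")
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


def gaps_by_stream(entries: list) -> dict:
    """Group entries per (agent, relying party) and report only streams with gaps."""
    streams = defaultdict(list)
    for e in entries:
        streams[(e["agent"], e["relying_party"])].append(e["seq"])
    report = {}
    for key, seqs in streams.items():
        gaps = find_gaps(sorted(seqs))
        if gaps:
            report[key] = gaps
    return report


entries = [
    {"agent": "a1", "relying_party": "rp1", "seq": 1},
    {"agent": "a1", "relying_party": "rp1", "seq": 2},
    {"agent": "a1", "relying_party": "rp1", "seq": 5},
    {"agent": "a2", "relying_party": "rp1", "seq": 1},
    {"agent": "a2", "relying_party": "rp1", "seq": 2},
]
print(gaps_by_stream(entries))  # {('a1', 'rp1'): [(3, 4)]}
```

Every range in the report needs an explanation on file before the auditor asks for that time window.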

Replay resistance

Tokens used in attestation must be unique per call. Single-use nonces, timestamps checked within a skew tolerance, and per-action signature scopes prevent an attacker from inserting forged or replayed actions retroactively into a clean-looking log.
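The nonce and timestamp checks can be sketched as follows. `ReplayGuard` and the 30-second tolerance are hypothetical choices for illustration; the sketch keeps nonces in memory and does not cover per-action signature scopes, so a real deployment would need durable storage and expiry of old nonces.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(seconds=30)  # assumed tolerance, tune per deployment


class ReplayGuard:
    """Accept an attestation only once, and only if its timestamp is fresh."""

    def __init__(self) -> None:
        self._seen = set()

    def accept(self, nonce: str, ts: datetime, now: datetime) -> bool:
        if abs(now - ts) > MAX_SKEW:
            return False  # too old, or too far in the future
        if nonce in self._seen:
            return False  # replayed nonce
        self._seen.add(nonce)
        return True


guard = ReplayGuard()
now = datetime(2026, 5, 3, 14, 22, 1, tzinfo=timezone.utc)

print(guard.accept("n-1", now, now))                         # True
print(guard.accept("n-1", now, now))                         # False (replay)
print(guard.accept("n-2", now - timedelta(minutes=5), now))  # False (stale)
```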

Retention and access

HIPAA: 6 years. SOX: 7 years. AI Act: at least 6 months for log records (the logging obligation is in Art. 12; the retention period is in Arts. 19 and 26(6)), longer for technical documentation. PIPL: 3+ years depending on category. Build retention to the longest applicable requirement; do not assume your AI vendor's default is enough.
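"Longest applicable" reduces to taking a maximum over the regimes that apply to a given system. The sketch below uses the minimum figures quoted above; the table and function name are illustrative, and the real values should come from counsel.

```python
# Minimum retention in years, taken from the figures quoted in this section.
RETENTION_YEARS = {
    "HIPAA": 6,
    "SOX": 7,
    "EU AI Act (logs)": 0.5,
    "PIPL": 3,
}


def required_retention_years(regimes: list) -> float:
    """Return the longest minimum retention among the applicable regimes."""
    if not regimes:
        raise ValueError("no regimes given")
    return max(RETENTION_YEARS[r] for r in regimes)


print(required_retention_years(["HIPAA", "EU AI Act (logs)"]))  # 6
print(required_retention_years(["SOX", "PIPL", "HIPAA"]))       # 7
```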

Access to the log itself should be a least-privilege control with its own audit. Auditing the auditor is part of credible audit.

Common mistakes

Logs in S3 buckets without object-lock and without external commitment.
Per-team log shapes with no canonical schema.
Missing the second-human field for Article 14-relevant systems.
Storing identity references but not competence certificates.
Recording the agent action but not the result hash, leaving the "what actually happened" question to vendor cooperation.

What auditors love

A live walk-through where you randomly pick a transaction from yesterday and follow the audit log all the way back to the verified human and forward to the result, in under 60 seconds. The chain stays intact, the cryptographic verifications pass, and the record explains itself without your help. This is the demo that ends inquiries quickly.

Common objections

Compliance teams push back with two reasonable concerns. Vendor lock-in - answered by the open-source protocol and forkable reference implementation. Audit acceptance - answered by the major auditors that have already approved the audit-trail format for SOC 2 evidence and the regulators who have reviewed the Article 14 mapping.

Frequently asked questions

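What is the penalty exposure if we ignore this? Material. The EU AI Act's penalty article (Article 99) sets fines of up to €35M or 7% of worldwide annual turnover for prohibited practices, and up to €15M or 3% for breaches of high-risk obligations such as Article 14 human oversight, whichever is higher in each case. SOC 2 audit failures jeopardize enterprise procurement. The cost of the audit-trail layer is small relative to either.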

Do we need to be in the EU for this to matter? No. The AI Act applies to AI systems placed on the EU market or put into service there, including by non-EU vendors selling into the EU, and Article 14 covers those classed as high-risk. Most US enterprises with European customers are in scope. The same controls also support emerging US sectoral rules and India's DPDPA.

How long does compliance take to set up? Two weeks for an instrumented stack. Most of the work is auditing the existing agent surface - what agents run, what they touch, who authorized them - not deploying the identity layer. The protocol integrates in twelve lines; the policy work takes longer.

Where to start

Pair this with the AI Act Article 14 playbook for the cross-jurisdictional view and the CISO compliance stack for the audit artifact your auditors expect to see. Most compliance projects we have seen succeed by reading all three together before scoping anything.

The line auditors look for first

The first line every auditor reads in an audit trail is the human DID. Not the timestamp, not the action, not the resource - the human. If the audit trail names a system or an agent without naming the human upstream, the auditor stops reading and writes a finding. This is not a procedural quirk; it is the entire reason audit trails exist. The line satisfies the regulator, the line satisfies the insurer, the line satisfies the court. Designing the trail around that line - putting the human DID first in the row order, making it non-nullable, indexing it for fast retrieval - is the single design decision that compounds across every later requirement. We learned this from auditors before we learned it from regulators. The audit trail that satisfies one auditor satisfies most; the trail that fails the first line fails them all.

The audit log is the product. Everything else is implementation detail.