Manav.id

Building the kill switch

Kill switch design

Most kill switches are theater. They look reassuring on a slide and have never been pulled at production scale. Here is how to build one that survives a 3 a.m. incident.

The four properties of a real kill switch

A kill switch worth the name has four properties. Most production systems satisfy at most two.

The reference design

Manav's kill switch operates in three layers:

  1. Active push (sub-50 ms). Webhook fanout to every subscribed relying party. Each receives a signed revocation message and updates its local accept-list immediately.
  2. Pull-back (under 60 s). Relying parties refresh the active token set every 60 seconds as a safety net for missed pushes.
  3. Reject-on-failure (immediate). If a relying party's signature verification fails on any subsequent token use, it forces an immediate refresh and rejects the call.

The push covers the hot path. The pull catches drift. The reject-on-failure handles network partitions.
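The three layers can be sketched from the relying party's side. This is a minimal illustration, not Manav's implementation: the `RelyingParty` class, its method names, the `revoke:`/`token:` payload prefixes, and the use of a shared HMAC key are all assumptions made for the example (a real deployment would more likely verify asymmetric signatures). The pull is done lazily at authorization time rather than on a background timer to keep the sketch short.

```python
import hashlib
import hmac
import time


class RelyingParty:
    """Hypothetical relying party holding a local accept-list of active token IDs."""

    def __init__(self, secret, fetch_active, refresh_interval=60.0):
        self._secret = secret                # shared HMAC key (an assumption of this sketch)
        self._fetch_active = fetch_active    # callable returning the issuer's active token IDs
        self._refresh_interval = refresh_interval
        self._active = set()
        self._last_refresh = float("-inf")   # forces a pull on first use

    def _valid(self, payload, signature):
        expected = hmac.new(self._secret, payload.encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected.encode(), signature.encode())

    def on_revocation_push(self, token_id, signature):
        """Layer 1: apply a signed revocation message immediately."""
        if not self._valid(f"revoke:{token_id}", signature):
            return False                     # ignore unsigned or forged pushes
        self._active.discard(token_id)
        return True

    def refresh(self, now):
        """Layer 2: replace the accept-list with the issuer's current view."""
        self._active = set(self._fetch_active())
        self._last_refresh = now

    def authorize(self, token_id, signature, now=None):
        now = time.monotonic() if now is None else now
        if now - self._last_refresh >= self._refresh_interval:
            self.refresh(now)                # periodic pull catches missed pushes
        if not self._valid(f"token:{token_id}", signature):
            self.refresh(now)                # Layer 3: force a refresh and reject the call
            return False
        return token_id in self._active
```

In this sketch a token that was revoked while the push was lost stays usable until the next pull, which is why the refresh interval bounds the worst-case revocation latency.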

The test you must run

A kill switch you have not tested in production is a hypothesis. Run this drill quarterly:

  1. Pick a non-production agent in your fleet.
  2. At a random moment during its operation, click revoke.
  3. Measure the time to first rejection at each relying party, the time to the last rejection across the fleet, and the p50 and p99 of total revocation latency.
  4. Document and improve until p99 is under 1 second across the fleet.

The first time most teams run this drill, they discover relying parties that ignore revocations entirely. The drill is the only way to find them.
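The drill's measurements can be reduced to a small report. The sketch below assumes you have already collected, for each relying party, the timestamp of its first rejection (or `None` if it never rejected the revoked agent). The function names, the input shape, and the nearest-rank percentile method are illustrative choices, not part of any published tooling.

```python
import math


def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize_drill(revoked_at, first_rejection_at, p99_budget=1.0):
    """Summarize one revocation drill.

    revoked_at: timestamp in seconds when revoke was clicked.
    first_rejection_at: relying party name -> timestamp of its first rejection,
        or None if it never rejected the revoked agent during the drill.
    """
    silent = sorted(name for name, t in first_rejection_at.items() if t is None)
    latencies = sorted(
        t - revoked_at for t in first_rejection_at.values() if t is not None
    )
    if not latencies:
        return {"first": None, "last": None, "p50": None, "p99": None,
                "silent": silent, "passed": False}
    p99 = nearest_rank(latencies, 99)
    return {
        "first": latencies[0],    # fastest relying party
        "last": latencies[-1],    # slowest relying party that did reject
        "p50": nearest_rank(latencies, 50),
        "p99": p99,
        "silent": silent,         # relying parties that ignored the revocation
        # The drill fails if anyone stayed silent, whatever the percentiles say.
        "passed": not silent and p99 < p99_budget,
    }
```

Reporting silent relying parties separately matters: they have no latency to measure, so a percentile computed only over the parties that responded would hide exactly the failure the drill exists to find.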

The compliance angle

EU AI Act Article 14 expects a stop function that allows the overseer to halt the system. A documented, tested kill switch is the technical artifact that satisfies this expectation. Without the documentation, your auditor cannot conclude the function exists. Without the test, the auditor cannot conclude it works.

What goes wrong in practice

Common objections

Engineers push back on three things:

  1. Latency. The verification cache brings the hot path down to 18 µs, small enough for most production systems.
  2. Vendor lock-in. The protocol is open, the spec is published, and the reference implementation is forkable.
  3. Another auth dance. The integration is twelve lines of middleware, not a new platform to manage.
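To make the latency and integration objections concrete, here is a rough sketch of what cached verification wrapped in middleware could look like. It is not the Manav SDK: `VerificationCache`, `require_delegation`, and the convention of passing the token as the first argument are hypothetical names and choices for illustration. The point is only that a warm cache turns verification into a set lookup, and that revocation must evict the cached entry.

```python
import functools


class VerificationCache:
    """Remembers tokens that already passed full verification."""

    def __init__(self, verify):
        self._verify = verify    # slow path: full verification (assumed callable)
        self._verified = set()

    def check(self, token):
        if token in self._verified:
            return True          # hot path: no signature work
        if self._verify(token):
            self._verified.add(token)
            return True
        return False

    def revoke(self, token):
        """Evict a token so the next call goes through full verification again."""
        self._verified.discard(token)


def require_delegation(cache):
    """Decorator: the wrapped tool only runs if the caller's token verifies."""
    def decorator(tool):
        @functools.wraps(tool)
        def guarded(token, *args, **kwargs):
            if not cache.check(token):
                raise PermissionError("delegation rejected")
            return tool(*args, **kwargs)
        return guarded
    return decorator
```

A cache like this is only safe if every revocation path calls `revoke`; otherwise the cache itself becomes a relying party that ignores revocations.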

Frequently asked questions

What is the runtime cost? Microseconds per tool call when the verification cache is warm (about 18 µs on the hot path). Cold verification is 1–2 ms. Both numbers are small relative to the LLM round-trip the agent is already paying.

Does it work with our existing agent framework? Yes. The protocol is host-agnostic. SDKs ship for Python, Go, Node, Rust, and TypeScript; integrations exist for LangChain, CrewAI, AutoGen, and the Claude Agent SDK. Anything that calls a tool can present a delegation.

What happens to delegations when an engineer leaves? They die at the human's offboarding. The IdP de-provisions the human; the device key is rotated; every active delegation that human signed is invalidated within 200 ms. No service-account graveyard for the new owner to clean up six months later.
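The offboarding cascade amounts to indexing delegations by the human who signed them. The sketch below is an assumption about how such an index might look. `DelegationRegistry` and its methods are hypothetical, and the IdP de-provisioning, key rotation, and fanout to relying parties are out of scope here.

```python
from collections import defaultdict


class DelegationRegistry:
    """Tracks which human signed which active delegation."""

    def __init__(self):
        self._by_signer = defaultdict(set)
        self._active = set()

    def issue(self, signer, delegation_id):
        self._by_signer[signer].add(delegation_id)
        self._active.add(delegation_id)

    def is_active(self, delegation_id):
        return delegation_id in self._active

    def offboard(self, signer):
        """Invalidate every delegation the departing human signed."""
        revoked = self._by_signer.pop(signer, set())
        self._active -= revoked
        return revoked    # hand these IDs to the revocation fanout
```

Because the index is keyed by signer, offboarding does not require scanning every delegation in the system, and delegations signed by other people are untouched.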

Where to start

Hands-on next: "Delegation tokens explained" walks through the twelve-line integration, and the "AI Act Article 14 playbook" adds the operational layer once you have the basics. Both link to working repos: clone, integrate, run the bench.

Why kill switches need their own audit trail

A kill switch that activates without an audit trail is functionally identical to no kill switch at all. The regulator does not credit the activation; the insurer does not credit the activation; the post-incident investigator cannot reconstruct the activation. The audit row is what makes the activation count.

The Manav kill switch design produces three audit rows: who activated it, what scope was revoked, and what actions were in flight at the moment of activation. The third row matters most. In-flight actions need a deterministic disposition (committed, rolled back, or escalated), and the kill switch row is where that disposition is recorded for the parties downstream. Builders who ship a kill switch without the in-flight audit row discover the gap during the first incident, when the post-mortem cannot reconstruct what the agent did between activation and termination. The audit is not a feature of the kill switch. It is the kill switch.
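One way to picture the three audit rows is as a small record type plus a check that no in-flight action is left without a disposition. The row kinds, field names, and the `activation_audit_rows` function below are hypothetical and do not reflect the actual Manav schema; only the three dispositions come from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"
    ESCALATED = "escalated"


@dataclass(frozen=True)
class AuditRow:
    kind: str
    detail: dict


def activation_audit_rows(actor, scope, in_flight):
    """Build the three rows for one activation.

    in_flight maps an action ID to its Disposition. Any action without a
    Disposition is an error: the activation cannot be reconstructed later.
    """
    unresolved = sorted(
        action for action, disposition in in_flight.items()
        if not isinstance(disposition, Disposition)
    )
    if unresolved:
        raise ValueError(f"in-flight actions without a disposition: {unresolved}")
    return [
        AuditRow("activated_by", {"actor": actor}),
        AuditRow("scope_revoked", {"scope": scope}),
        AuditRow("in_flight", {
            "dispositions": {a: d.value for a, d in in_flight.items()},
        }),
    ]
```

Refusing to write the rows while any action is unresolved is a deliberate choice in this sketch: it surfaces the gap at activation time rather than during the post-mortem.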

The kill switch is the most reassuring control in your stack — and the least tested. Both should change.