Manav.id
Developer · 4 min read

Identity verification at 100k RPS

Performance benchmark

A relying party that can verify Manav delegations at 100,000 requests per second is roughly the difference between "we use this in production" and "we use this in compliance demos." Here is the benchmark, the numbers, and the cache layer that closes the gap.

The benchmark setup

Open-source repo at github.com/manav-id/bench-100k. Single c7gn.4xlarge box, Ubuntu 24.04, Go 1.23. Workload: a synthetic mix of 70% delegation verifications, 20% revocation-status checks, 10% audit-event publishes. Each request is a real Ed25519 verification, not a stub.

Hot-path numbers

Cold verify (no cache): 1.42 ms median, 3.1 ms p99. Throughput per core: 7,800 RPS. Sixteen cores saturate at ~118,000 RPS without cache; CPU sits at 92%. With the verification cache enabled, hot path drops to 18 µs median, 96 µs p99; the same box clears 480k RPS at 41% CPU.

Why a cache is honest

The signature on a delegation does not change for the lifetime of the delegation. If the human signed for 8 hours, the bytes are stable for 8 hours. Caching the verification result by (delegation_id, scope, params_hash, current_revocation_epoch) is correct, not a shortcut. The revocation epoch invalidates the cache the moment the human revokes — so the cache cannot serve stale authority.
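The cache-key tuple above can be sketched as a struct whose fingerprint feeds the cache map. This is a minimal illustration, not the bench repo's actual types: `VerifyKey` and `fingerprint` are hypothetical names. The point it demonstrates is the invalidation property: bumping the revocation epoch changes the key, so every cached hit for that human misses.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// VerifyKey is a hypothetical cache key matching the tuple described above.
// A cached result is valid only while the human's revocation epoch is unchanged.
type VerifyKey struct {
	DelegationID string
	Scope        string
	ParamsHash   [32]byte
	Epoch        uint64 // per-human revocation epoch
}

// fingerprint collapses the key into a fixed-size digest suitable for a map key.
func (k VerifyKey) fingerprint() [32]byte {
	h := sha256.New()
	h.Write([]byte(k.DelegationID))
	h.Write([]byte{0}) // field separator
	h.Write([]byte(k.Scope))
	h.Write([]byte{0})
	h.Write(k.ParamsHash[:])
	var e [8]byte
	binary.BigEndian.PutUint64(e[:], k.Epoch)
	h.Write(e[:])
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	cache := map[[32]byte]bool{}
	k := VerifyKey{DelegationID: "dlg_1", Scope: "repo:read", Epoch: 7}
	cache[k.fingerprint()] = true

	// A revocation bumps the epoch; the old entry can never be served again.
	k.Epoch = 8
	fmt.Println(cache[k.fingerprint()]) // prints "false": the hit was invalidated
}
```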

The revocation channel

The relying party subscribes to a single Server-Sent Events stream. Revocations are pushed in under 200 ms from click to broadcast. Each revocation increments a per-human epoch counter; cache keys include the epoch, so a single integer change invalidates everything that human signed. No per-request lookup, no fan-out fan-in pain.
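The epoch mechanism can be sketched as follows, assuming a long-lived SSE stream whose `data:` lines name the revoked human. The endpoint, event format, and type names here are illustrative, not the published wire format; what the sketch shows is the single-integer invalidation: one counter bump per human, no per-entry cache sweep.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"sync"
)

// epochs tracks a per-human revocation counter. Cache keys include the epoch,
// so bumping one integer invalidates every cached verification for that human.
type epochs struct {
	mu sync.RWMutex
	m  map[string]uint64
}

func (e *epochs) bump(human string) { e.mu.Lock(); e.m[human]++; e.mu.Unlock() }

func (e *epochs) get(human string) uint64 {
	e.mu.RLock()
	defer e.mu.RUnlock()
	return e.m[human]
}

// consume reads an SSE body line by line. In production this would be the
// response body of a long-lived GET against the revocation stream; here it is
// a string, and "data: revoked <human>" is an assumed event shape.
func consume(e *epochs, body string) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		if human, ok := strings.CutPrefix(sc.Text(), "data: revoked "); ok {
			e.bump(human)
		}
	}
}

func main() {
	e := &epochs{m: map[string]uint64{}}
	consume(e, "data: revoked alice\n\ndata: revoked alice\n")
	fmt.Println(e.get("alice")) // prints "2": two revocation events, one counter
}
```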

Memory profile

The cache holds roughly 96 bytes per entry. A box doing 100k RPS with a 90-second TTL holds about 9M entries — under 1 GB. Production-realistic.
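The sizing arithmetic is worth making explicit: at steady state, a cache with a fixed TTL holds at most (request rate × TTL) entries if every request is distinct. Using the article's figures:

```go
package main

import "fmt"

func main() {
	const (
		rps          = 100_000 // sustained request rate
		ttlSeconds   = 90      // cache TTL
		bytesPerItem = 96      // per-entry footprint from the bench
	)
	// Worst-case steady state: every request inserts a fresh entry.
	entries := rps * ttlSeconds
	totalMB := entries * bytesPerItem / (1 << 20)
	fmt.Printf("%d entries, %d MB\n", entries, totalMB) // 9000000 entries, 823 MB
}
```

Repeated requests share entries, so real occupancy is lower; 823 MB is the ceiling, consistent with the "under 1 GB" claim.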

The cold path

Cold verifications still hit Ed25519 directly. The bench uses Go's stdlib ed25519.Verify; switching to circl/sign/ed25519 shaves another 20%. For relying parties on ARM64 with hardware Ed25519, the floor is closer to 0.7 ms median.

Audit publish

Audit events are emitted asynchronously to a local ring buffer and shipped to the audit ledger via gRPC. Publishing is decoupled from the request path; the worst-case audit latency is 800 ms p99 under load, but it does not block verification.
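The decoupling can be sketched with a bounded buffer and a non-blocking send. This is an illustration of the pattern, not the bench's ring-buffer implementation; `auditEvent`, `publisher`, and `emit` are hypothetical names. The property that matters is that `emit` returns immediately whether or not the buffer has room, so verification never stalls on the ledger.

```go
package main

import "fmt"

// auditEvent is a hypothetical audit record.
type auditEvent struct{ Action string }

// publisher decouples audit emission from the request path: a bounded buffer
// absorbs bursts, and the caller never blocks on the downstream ledger.
type publisher struct{ ch chan auditEvent }

func newPublisher(size int) *publisher {
	return &publisher{ch: make(chan auditEvent, size)}
}

// emit is the request-path call: non-blocking, reporting overflow rather than
// stalling verification. A real system would count drops and apply back-pressure.
func (p *publisher) emit(ev auditEvent) bool {
	select {
	case p.ch <- ev:
		return true
	default:
		return false // buffer full; request path unaffected
	}
}

func main() {
	p := newPublisher(2)
	fmt.Println(p.emit(auditEvent{"verify"})) // true
	fmt.Println(p.emit(auditEvent{"verify"})) // true
	fmt.Println(p.emit(auditEvent{"verify"})) // false: buffer full
	// A background goroutine would drain p.ch and ship batches over gRPC.
}
```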

What the bench proves

A single mid-size box can sit on the hot path of a Stripe-scale service and still verify every action against a Manav delegation. There is no horizontal scaling story to tell because the vertical story already covers most production loads. We publish the bench because anybody can rerun it.

Common objections

Engineers push back on three things. Latency: the cache brings hot-path verification to 18 µs, fine for any production system. Vendor lock-in: the protocol is open, the spec is published, and the reference implementation is forkable. Another auth dance: the integration is twelve lines of middleware, not a new platform to manage.

Frequently asked questions

What is the runtime cost? Tens of microseconds per tool call when the verification cache is warm; a cold verification is 1–2 ms. Both numbers are small relative to the LLM round-trip the agent is already paying.

Does it work with our existing agent framework? Yes. The protocol is host-agnostic. SDKs ship for Python, Go, Node, Rust, and TypeScript; integrations exist for LangChain, CrewAI, AutoGen, and the Claude Agent SDK. Anything that calls a tool can present a delegation.

What happens to delegations when an engineer leaves? They die at the human's offboarding. The IdP de-provisions the human; the device key is rotated; every active delegation that human signed is invalidated within 200 ms. No service-account graveyard for the new owner to clean up six months later.

Where to start

Hands-on next: "Cross-platform agent identity" ships in twelve lines; "Webhooks, not polls" adds the operational layer once you have the basics. Both link to working repos; clone, integrate, run the bench.

The bottleneck that surprised us

We sized the system for 100,000 verifications per second on the assumption that the cryptographic verification — signature checks, scope validations — would be the hot path. The benchmarking proved otherwise. The cryptography ran comfortably; the bottleneck was the audit-log write. Append-only logs at 100k RPS require careful index management, careful disk IO patterns, and careful network back-pressure handling. We rebuilt the audit layer twice before the numbers held under sustained load. The lesson generalizes to any compliance-grade system: the evidence you produce is heavier than the decision you make, because the decision is computational and the evidence is durable. Builders who size only for the decision discover the audit layer at peak load, which is the worst time to discover it. We learned this and published the architecture so future builders skip the lesson. The cryptography will not be your bottleneck. The audit will.

If your identity layer makes you choose between performance and audit, you don't have an identity layer. You have security theater.