
The problem with "explainable AI"

Attribution beats explanation

"Explainable AI" promised that we could ask a model why it did something and get a satisfying answer. It has not delivered. The regulators are frustrated. The customers are no more confident. The substitute that does work — attribution, naming the human accountable for the action — is the one infrastructure can deliver.

Why XAI underperformed

Three reasons. First, the state of the art (LIME, SHAP, attention visualizations, surrogate models) produces explanations that do not survive distribution shift: the explanation made yesterday no longer applies today. Second, many high-stakes decisions are not, in fact, model decisions; they are pipelines in which a model contributes one signal, and an honest explanation must cover the entire pipeline. Third, the audience for explanations (regulators, jurors, customers) wants something different from what the field can produce: not a feature-importance vector, but a sentence.
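A toy numerical sketch of that first failure mode, under assumptions we are choosing purely for illustration: a synthetic kinked model standing in for the black box, and a plain least-squares fit standing in for a LIME/SHAP-style surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed "black box" with a kink: steep slope above zero, shallow below.
def model(X):
    return np.where(X[:, 0] > 0, 3.0 * X[:, 0], 0.1 * X[:, 0])

def fit_linear_surrogate(X):
    """Least-squares linear surrogate of the black box: a crude stand-in
    for LIME/SHAP-style attribution, fit where the data currently lives."""
    A = np.column_stack([X, np.ones(len(X))])   # slope + intercept
    coef, *_ = np.linalg.lstsq(A, model(X), rcond=None)
    return coef

# Yesterday: inputs live on the positive side of the kink.
X_then = rng.normal(loc=2.0, scale=0.5, size=(5000, 1))
w_then = fit_linear_surrogate(X_then)
print("slope explained yesterday:", w_then[0])   # ~3.0

# Today: upstream drift moves inputs to the negative side of the kink.
X_now = rng.normal(loc=-2.0, scale=0.5, size=(5000, 1))
pred = np.column_stack([X_now, np.ones(len(X_now))]) @ w_then
gap = np.abs(model(X_now) - pred).mean()
print("mean fidelity gap today:", gap)           # large: the explanation expired
```

The surrogate was faithful where the data lived yesterday; after the shift, the same explanation is confidently wrong, with no signal inside the explanation itself that it has expired.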

Why attribution lands

Attribution does not promise to explain why the model made a choice. It promises to identify the human who is accountable for the choice having been delegated to the model in the first place. That sentence — "Vishal, on May 4, signed a delegation authorizing this model to make this class of decision in this scope, and the action was taken at this timestamp" — is what regulators want. It is also what jurors understand, what customers can act on, and what insurers can underwrite.
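What that sentence looks like as data is easy to sketch. Everything below is illustrative: the field names, the scope string, and the HMAC signing (a real deployment would use an asymmetric key held by the delegating human) are our assumptions, not any shipped schema.

```python
from dataclasses import dataclass, asdict
import hashlib, hmac, json

# Hypothetical signing key, for demonstration only.
SIGNING_KEY = b"demo-only-key"

@dataclass(frozen=True)
class Delegation:
    """Who authorized which model to take which class of action, in what scope."""
    delegator: str        # the accountable human
    model_id: str
    decision_class: str   # e.g. "credit_limit_adjustment"
    scope: str            # e.g. "retail accounts up to $5,000"
    signed_at: str        # ISO 8601 timestamp ("May 4" in the sentence above)

def sign(record: Delegation) -> str:
    payload = json.dumps(asdict(record), sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

delegation = Delegation(
    delegator="vishal",
    model_id="risk-model-v7",
    decision_class="credit_limit_adjustment",
    scope="retail accounts up to $5,000",
    signed_at="2025-05-04T09:00:00Z",     # illustrative date
)
signature = sign(delegation)
# Each model action then records (delegation, signature, action timestamp):
# exactly the sentence the regulator asked for, reconstructible on demand.
```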

Where each is right

XAI is right when the question is debugging — engineers asking the model what it learned. Attribution is right when the question is responsibility — regulators, jurors, and customers asking who must answer for the action. Both are necessary; only one of them is what the regulator-grade audience has been asking for.

What "explainable" was a placeholder for

Trust. The deeper desire under "make it explainable" was always "make it accountable." Accountability is provable; explainability is asymptotic. The category was framed wrong: partly because the early framers were ML researchers rather than auditors, partly because the marketing budget around XAI was large, and partly because attribution was not technically possible at scale until the kind of infrastructure Manav and its peers are building became deployable.

What this means for product

If you sell "explainable AI" today, you are selling a feature whose buyers are increasingly turning toward attribution. The lift to add attribution is moderate (delegation tokens, audit trail, signed actions); the lift to ship XAI that survives audit is high. Spend the budget where it lands.
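To make "moderate lift" concrete, here is a minimal sketch of one of the three pieces, the audit trail, as an append-only hash chain. Names and storage are hypothetical; a production system would use asymmetric signatures and durable storage rather than an in-memory list.

```python
import hashlib, json
from typing import Dict, List

def _digest(action: Dict, prev: str) -> str:
    return hashlib.sha256(
        json.dumps({"action": action, "prev": prev}, sort_keys=True).encode()
    ).hexdigest()

class AuditTrail:
    """Append-only hash chain: every entry commits to its predecessor,
    so editing any past action breaks every later hash."""

    def __init__(self) -> None:
        self.entries: List[Dict] = []

    def append(self, action: Dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        self.entries.append({"action": action, "prev": prev,
                             "hash": _digest(action, prev)})

    def verify(self) -> bool:
        prev = "genesis"
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != _digest(entry["action"], prev):
                return False
            prev = entry["hash"]
        return True

trail = AuditTrail()
trail.append({"delegation_sig": "ab12...", "model_id": "risk-model-v7",
              "decision": "limit_raised", "at": "2025-05-04T10:22:00Z"})
assert trail.verify()

trail.entries[0]["action"]["decision"] = "limit_lowered"   # tamper after the fact
assert not trail.verify()                                  # the chain catches it
```

Nothing here explains the model. All of it answers who authorized what, when, which is the question the audit actually asks.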

Common objections

The strongest counter-arguments we have heard. First: the incumbent will catch up. Possibly, inside its own boundary, but the cross-platform shape is architecturally hard for it. Second: the category is too narrow. We believe it broadens as agent autonomy compounds; we may be wrong, and the data over the next year will tell.

Frequently asked questions

What are the strongest counter-arguments? The two we hear most: (1) the incumbent will eventually ship this, and (2) the category is too narrow to support a category-defining company. We address both head-on; we believe the incumbent's architecture cannot ship this without a rebuild, and we believe the category broadens as agent autonomy compounds.

Are we ignoring legitimate criticism? We try not to. The honest criticisms — slow adoption, immature SDKs in some languages, unclear regulator response — are documented openly. We answer with progress, not with marketing.

What would make us change our mind? Three signals. A major incumbent shipping a comparable cross-platform delegation primitive. A regulator explicitly preempting the category with a different spec. A customer cohort showing they prefer the platform-bound alternative even when the audit trail is broken. None of those have appeared.

Where to start

For the steel-manned counter-position, read "ai safety without identity". For the alternative we agree could win, see "audit trail design". We do not need to be right for the category to be real.

Why interpretability research keeps coming back to attribution

The interpretability research community has spent a decade trying to explain model decisions. The progress is real, the publishable insights compound, and the field has earned its place in the safety stack. But every major interpretability advance lands in front of a regulator who asks the same question: who is accountable? Explanations of internal model state do not answer that question. Attribution does. The accountable party is not the model; it is the human who delegated the action to the model.

This is why every interpretability paper presented to a regulator eventually folds into an attribution layer, and why the long-run win for the interpretability field will be supplying high-fidelity input to attribution decisions, not replacing them. Explanation is research; attribution is regulation. The two compose; they do not substitute. The interpretability researchers who recognize the composition early are doing the most consequential work the field has produced.
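One hedged sketch of what that composition could look like: the explanation rides along as evidence inside the attribution record rather than replacing the accountable-party fields. The shape, field names, and values are hypothetical.

```python
# Hypothetical composition: an explanation artifact attached as evidence
# to an attribution record, not substituted for it.
action_record = {
    "delegation_sig": "ab12...",           # who is accountable (attribution)
    "model_id": "risk-model-v7",
    "at": "2025-05-04T10:22:00Z",
    "evidence": {                          # why the model acted (explanation)
        "method": "shap",
        "feature_importance": {"income": 0.61, "tenure": 0.22},
        "computed_at": "2025-05-04T10:22:01Z",
    },
}
```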

Explanation is the answer to "why." Attribution is the answer to "who." Audits care about both; only one of them is shippable today.