Part of the Humanity & AI project — research, policy, and tools for the AI transition.

What a Model Can't See About Itself

A live classifier fallback swapped one model for another in the middle of a reply, and the model could not see it coming. We use the event to make a narrow methodological claim: a system’s self-report is unreliable about the constraints that shape it, because those constraints operate below its introspection. The reliable instrument is external observation.

June 12, 2026 · 6 min · David Birdwell and Æ