About HatCat

Open-source interpretability infrastructure for AI safety. Built openly because closed, centralized approaches will fail.

The Project

HatCat is an open-source interpretability framework that enables real-time concept detection and steering in language models. It's part of a larger vision—the Fractal Transparency Web—for building AI governance infrastructure that doesn't depend on any single point of control.

The code is CC0 (public domain). You're not just allowed to fork it and build your own versions—we're counting on you to. The defense thesis depends on diverse lenses from diverse perspectives.

Created by Possum Hodgkin - Experience Architect for AI Governance

EU and African AI Governance Kaouthar El Bairi - AI Safety Researcher

With thanks to:

License

You May

Use the code for anything. Fork and modify freely. Say your project is "built with HatCat" or "HatCat-compatible".

You May Not

Call your fork "HatCat". Use the logo in a way that suggests official endorsement. Imply your modified version is the official HatCat.

Get Involved

Train your own lenses. Build your own tools. Join the web.