Mechanistic Interpretability Hub

A living summary of key findings, open questions, and the latest research on how neural networks work internally.


Papers

Latest research from arXiv, conferences, and research labs. Updated automatically.

Key Findings

Major discoveries that have shaped our understanding of how neural networks compute internally.

Open Questions

Critical unsolved problems that the field is actively working on.

Core Techniques

The main methodological approaches used in mechanistic interpretability research.

Connections to Neuroscience

Mechanistic interpretability shares deep roots with neuroscience. Understanding these parallels can illuminate both fields and let insights transfer between them.

Resources

Essential reading, tools, and communities for mechanistic interpretability.

Essential Reading

Research Groups

Tools & Libraries

Communities

Stay Updated

This site automatically fetches the latest mechanistic interpretability papers from arXiv and major research labs.

Automated Updates

Papers are fetched daily via GitHub Actions from arXiv, Semantic Scholar, and major research blogs.
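As a rough illustration of what such a daily fetch job might do, here is a minimal Python sketch that builds a query against the public arXiv API and parses an Atom response. This is not the site's actual pipeline code: the search query, field names, and sample feed are illustrative assumptions.

```python
"""Sketch of a daily paper-fetch step against the arXiv API (hypothetical)."""
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv API endpoint
ATOM = "{http://www.w3.org/2005/Atom}"           # Atom XML namespace prefix

def build_query_url(search: str, max_results: int = 25) -> str:
    """Build an arXiv API URL for the most recently submitted matches."""
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": str(max_results),
    }
    return ARXIV_API + "?" + urllib.parse.urlencode(params)

def parse_atom(feed_xml: str) -> list[dict]:
    """Extract title, link, and publication date from an Atom feed string."""
    root = ET.fromstring(feed_xml)
    papers = []
    for entry in root.iter(ATOM + "entry"):
        papers.append({
            "title": entry.findtext(ATOM + "title", "").strip(),
            "url": entry.findtext(ATOM + "id", "").strip(),
            "published": entry.findtext(ATOM + "published", "").strip(),
        })
    return papers

# Parse a tiny hand-written sample response instead of calling the network;
# the entry below is a made-up placeholder, not a real paper record.
SAMPLE = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example Mechanistic Interpretability Paper</title>
    <id>http://arxiv.org/abs/0000.00000</id>
    <published>2024-01-01T00:00:00Z</published>
  </entry>
</feed>"""

url = build_query_url('all:"mechanistic interpretability"')
papers = parse_atom(SAMPLE)
```

In a scheduled GitHub Actions workflow, a script along these lines would run on a daily cron trigger, fetch `url`, and commit any new entries to the site's paper list; the same parsing approach extends to other Atom/RSS sources.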

Contribute

Found a paper we missed? Open an issue or submit a PR to add it.

RSS Feed

Subscribe to our RSS feed to get notified of new additions.