Stefan Heimersheim

Member of Technical Staff

Previously at Apollo Research, he conducted foundational work in mechanistic interpretability—including parameter decomposition methods and studies of activation plateaus—as well as applied projects using interpretability to detect deception in LLMs.

He holds a PhD in Astronomy from the University of Cambridge, where he focused on 21 cm cosmology and Bayesian inference. He is a co-organiser of the NeurIPS 2025 Mechanistic Interpretability workshop.

News & publications

Towards Automated Circuit Discovery for Mechanistic Interpretability
July 4, 2023