Stefan Heimersheim

Member of Technical Staff

Google DeepMind

Stefan was a Member of Technical Staff at FAR.AI until February 2026. Previously at Apollo Research, he conducted foundational work in mechanistic interpretability—including parameter decomposition methods and studies of activation plateaus—as well as applied projects using interpretability to detect deception in LLMs.

He holds a PhD in Astronomy from the University of Cambridge, where he focused on 21 cm cosmology and Bayesian inference. He is a co-organiser of the NeurIPS 2025 Mechanistic Interpretability workshop.

Concept Influence: Leveraging Interpretability to Improve Performance and Efficiency in Training Data Attribution
February 19, 2026
concept-data-attribution-02-2026
Training Reliable Activation Probes With a Handful of Positive Examples
September 30, 2025
training-reliable-activation-probes-with-a-handful-of-positive-examples
Transformers Don’t Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and Implications for Mechanistic Interpretability
September 30, 2025
transformers-dont-need-layernorm-at-inference-time
Compressed Computation is (probably) not Computation in Superposition
December 6, 2025
compressed-computation-is-probably-not-computation-in-superposition
Towards Automated Circuit Discovery for Mechanistic Interpretability
July 4, 2023
towards-automated-circuit-discovery-for-mechanistic-interpretability