News & publications
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
October 19, 2023
Open Problems in Mechanistic Interpretability
January 27, 2025
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
October 27, 2023