News & publications
                  Open Problems in Mechanistic Interpretability
                      January 27, 2025
                      Inverse Scaling: When Bigger Isn't Better
                      June 15, 2023
                      Improving Code Generation by Training with Natural Language Feedback
                      March 28, 2023
                      Pretraining Language Models with Human Preferences
                      February 16, 2023
                      