Chris Cundy

Research Scientist

Chris Cundy is a Research Scientist at FAR.AI. He is interested in how to detect and avoid misaligned behavior induced during training.

He received his PhD in Computer Science from Stanford University, where he was advised by Stefano Ermon. During his PhD he published on topics including causal inference, reinforcement learning, and large language models.

He previously worked at CHAI and FHI, and was a winner of the OpenAI Preparedness Challenge.

News & Publications

Why does training on insecure code make models broadly misaligned?
June 17, 2025
Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google
February 4, 2025
Pacing Outside the Box: RNNs Learn to Plan in Sokoban
July 24, 2024
Planning behavior in a recurrent neural network that plays Sokoban
Preference Learning with Lie Detectors can Induce Honesty or Evasion
June 5, 2025
AI Companies Should Report Pre- and Post-Mitigation Safety Evaluations
March 17, 2025