Dillon Bowen

Research Scientist

Dillon Bowen is a Research Scientist at FAR.AI focused on understanding catastrophic risks from frontier AI models.

He completed his PhD in Decision Processes at the Wharton School of Business under Philip Tetlock, focusing on statistics, experiment design, and forecasting. Dillon was previously the principal data scientist at a London-based startup and collaborated on AI safety research through ML Alignment and Theory Scholars (MATS) and UC Berkeley’s Center for Human-Compatible AI (CHAI).

News & Publications

A Toolkit for Estimating the Safety-Gap between Safety Trained and Helpful Only LLMs
July 31, 2025

The Safety Gap Toolkit: Evaluating Hidden Dangers of Open-Source Models
July 8, 2025

Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google
February 4, 2025

GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning
October 31, 2024

Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws
August 6, 2024