News & Publications
Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility
July 15, 2025

Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google
February 4, 2025

GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning
October 31, 2024

Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws
August 6, 2024