NEWs & publications
Layered AI Defenses Have Holes: Vulnerabilities and Key Recommendations
July 2, 2025
defense-in-depth
STACK: Adversarial Attacks on LLM Safeguard Pipelines
stack-adversarial-attacks-on-llm-safeguard-pipelines
ClearHarm: A more challenging jailbreak dataset
June 23, 2025
clearharm-a-more-challenging-jailbreak-dataset
ClearHarm: A more challenging jailbreak dataset
clearharm-a-more-challenging-jailbreak-dataset
Does Robustness Improve with Scale?
July 23, 2024
does-robustness-improve-with-scale
Exploring Scaling Trends in LLM Robustness
exploring-scaling-trends-in-llm-robustness
STACK: Adversarial Attacks on LLM Safeguard Pipelines
July 2, 2025
stack-adversarial-attacks-on-llm-safeguard-pipelines
Layered AI Defenses Have Holes: Vulnerabilities and Key Recommendations
defense-in-depth
ClearHarm: A more challenging jailbreak dataset
June 23, 2025
clearharm-a-more-challenging-jailbreak-dataset
ClearHarm: A more challenging jailbreak dataset
clearharm-a-more-challenging-jailbreak-dataset
Exploring Scaling Trends in LLM Robustness
July 26, 2024
exploring-scaling-trends-in-llm-robustness
Does Robustness Improve with Scale?
does-robustness-improve-with-scale
Inverse Scaling: When Bigger Isn't Better
June 15, 2023
inverse-scaling-when-bigger-isnt-better