Oskar Hollinsworth

Research Resident

Oskar Hollinsworth is a research resident at FAR.AI, working on scaling laws for adversarial robustness. Previously, he studied how sentiment is represented in language models under Neel Nanda; that work was presented at NeurIPS ATTRIB 2023 and TAIS 2024. Before moving into research, Oskar worked as an algorithmic trader at Susquehanna International Group in Dublin.

News & Publications

Layered AI Defenses Have Holes: Vulnerabilities and Key Recommendations
July 2, 2025
STACK: Adversarial Attacks on LLM Safeguard Pipelines
ClearHarm: A more challenging jailbreak dataset
June 23, 2025
Does Robustness Improve with Scale?
July 23, 2024
Exploring Scaling Trends in LLM Robustness
STACK: Adversarial Attacks on LLM Safeguard Pipelines
July 2, 2025
Layered AI Defenses Have Holes: Vulnerabilities and Key Recommendations
ClearHarm: A more challenging jailbreak dataset
June 23, 2025
Exploring Scaling Trends in LLM Robustness
July 26, 2024
Does Robustness Improve with Scale?
