TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering
February 6, 2026
Accidental Misalignment: Fine-Tuning Language Models Induces Unexpected Vulnerability
May 22, 2025