Research

FAR.Research delivers technical breakthroughs to improve the safety and security of frontier AI systems.

Complex problems demand complex solutions.

FAR.AI conducts research to address fundamental artificial intelligence (AI) safety challenges. We rapidly explore a diverse portfolio of technical research agendas, de-risking and scaling up only the most promising solutions. We share our research outputs through peer-reviewed publications, via partnerships with governmental AI safety institutes, and through red-teaming engagements for leading AI companies.

As AI systems become more powerful and integrated into society, it’s critical to ensure they remain trustworthy, secure, and aligned with human interests. The stakes are high, yet currently, there are no known techniques capable of providing high-confidence guarantees of safety for frontier AI systems.

FAR.Research is dedicated to delivering the novel technical breakthroughs needed to mitigate the potential risks posed by frontier AI. As a non-profit research institute, we leverage our unique flexibility to focus on critical research directions that may be too large or resource-intensive for academia and often overlooked by the commercial sector due to their lack of immediate profitability.

Our Impact

We drive change through incubating research, scaling safety solutions, and informing policy.

Incubating

We de-risk and develop innovative solutions for trustworthy and secure AI. Through incubating research, we share the key insights, research roadmaps, and tools the broader research community needs to identify promising directions and make progress.

Scaling

We scale up the most promising safety solutions via in-house research, external collaborations, and targeted grantmaking. We facilitate rapid adoption of our findings by working with frontier model developers through red-teaming and other exercises.

Informing

Our research provides expert insights informing policy and public discussion. Our work has been cited in congressional testimony and mainstream media. In this way, we contribute to the establishment of technical standards that guide the development of AI.

Research highlights

Our research focuses on key insights into fundamental risks from advanced AI. Below is an abridged selection of FAR.AI’s latest and most notable research.

DeepSeek-R1 has recently made waves as a state-of-the-art open-weight model, with potentially substantial improvements in model efficiency and reasoning. But like other open-weight models and leading fine-tunable proprietary models such as OpenAI’s GPT-4o, Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3 Haiku, R1’s guardrails are illusory and easily removed.

Giving RNNs extra thinking time at the start of an episode boosts their planning skills in Sokoban. We explore how this planning ability develops during reinforcement learning. Intriguingly, we find that on harder levels the agent paces around to gain enough computation time to find a solution.

Achieving robustness remains a significant challenge even in narrow domains like Go. We test three approaches to defending Go AIs from adversarial strategies. We find these defenses protect against previously discovered adversaries, but we uncover qualitatively new adversaries that undermine them.

Our research portfolio covers three core agendas.

Robustness

Will superhuman AI systems be reliable and secure?

Model Evaluation

How can we tell if advanced AI systems are trustworthy and secure?

Interpretability

How can we understand an AI’s internals?