Matthew Kowal

Research Resident

Matthew Kowal is a research resident at FAR.AI studying the persuasion capabilities of LLMs. He specializes in interpreting the internal representations of multimodal models.

Matthew is a PhD candidate at York University in Toronto. His thesis focuses on designing multi-layer interpretability methods, with an emphasis on understanding how spatiotemporal models process concepts across space and time. His earlier research interpreted the representations of CNNs with respect to specific concepts, such as measuring shape versus texture and position information.

News & Publications

Frontier LLMs Attempt to Persuade into Harmful Topics
August 21, 2025

It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics
June 3, 2025

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
February 6, 2025