Matthew Kowal

Research Resident

Matthew Kowal is a research resident at FAR.AI studying the persuasion capabilities of LLMs. He specializes in interpreting the internal representations of multimodal models.

Matthew is a PhD candidate at York University in Toronto. His thesis focuses on designing multi-layer interpretability methods, with an emphasis on understanding how spatiotemporal models process concepts across space and time. His earlier research interpreted the representations of CNNs with respect to specific concepts, such as measuring shape versus texture and position information.

News & Publications

Frontier LLMs Attempt to Persuade into Harmful Topics
August 21, 2025

It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics
June 3, 2025

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
February 6, 2025