-
Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability?
Matt MacDermott*, Qiyao Wei*, Rada Djoneva*, Francis Rhys Ward*
arXiv preprint, 2025
-
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
Yoshua Bengio, Michael Cohen, ... Matt MacDermott, ... et al.
arXiv preprint, 2025
-
Can a Bayesian Oracle Prevent Harm from an Agent?
Yoshua Bengio, Michael K. Cohen, Nikolay Malkin, Matt MacDermott, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar
UAI 2025
-
Measuring Goal-Directedness
Matt MacDermott, James Fox, Francesco Belardinelli, Tom Everitt
NeurIPS 2024
Spotlight Paper
-
The Reasons that Agents Act: Intention and Instrumental Goals
Francis Rhys Ward, Matt MacDermott, Francesco Belardinelli, Francesca Toni, Tom Everitt
AAMAS 2024
-
Discovering Agents
Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, Tom Everitt
Artificial Intelligence, 2023
-
On Imperfect Recall in Multi-Agent Influence Diagrams
James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge
TARK 2023
Best Paper Award
-
Characterising Decision Theories with Mechanised Causal Graphs
Matt MacDermott, Tom Everitt, Francesco Belardinelli
Games, Agents and Incentives Workshop at AAMAS 2023