CogSci 2025

August 02, 2025

San Francisco, United States


Keywords: intelligent agents, theory of mind, decision making, artificial intelligence, machine learning

Deep reinforcement learning has achieved remarkable success in complex decision-making tasks, yet its black-box nature limits practical deployment in safety-critical domains. Current explainable reinforcement learning methods often fail to align with the hierarchical and temporal structure of human mental models, which are central to cognitive science theories of decision making. To bridge this gap, we propose Mental Model Alignment (MMA), a novel framework that constructs cognitive interfaces using behavior trees (BTs) to harmonize AI decision-making with human-understandable reasoning. MMA introduces three innovations: (1) a mental model encoder that captures the hierarchical decomposition of tasks into subgoals, mirroring human cognitive processes; (2) a cognitive pruning algorithm that simplifies BTs while preserving decision-critical nodes aligned with human mental schemas; and (3) a mental effort metric to quantify the cognitive load required for users to interpret policies. Evaluated across six benchmark environments, MMA outperforms state-of-the-art methods in interpretability, policy fidelity, and computational efficiency. Our results demonstrate that aligning AI policies with human mental models significantly enhances trust and usability in real-world applications.
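To make the pruning and effort ideas concrete, here is a minimal sketch, not the authors' implementation: all class and function names (`BTNode`, `prune`, `mental_effort`) and the importance-threshold rule are assumptions for illustration. It keeps any behavior-tree node that is itself decision-critical or has a decision-critical descendant, and scores cognitive load by the number of surviving nodes a user must read.

```python
# Hypothetical sketch of cognitive pruning on a behavior tree (BT).
# Names and the threshold rule are illustrative assumptions, not the MMA API.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BTNode:
    """A BT node annotated with an importance score for the policy's decisions."""
    name: str
    importance: float                       # e.g., influence on chosen actions
    children: List["BTNode"] = field(default_factory=list)

def prune(node: BTNode, threshold: float) -> Optional[BTNode]:
    """Drop subtrees in which no node reaches the importance threshold,
    keeping decision-critical nodes and their ancestors intact."""
    kept = [c for c in (prune(ch, threshold) for ch in node.children) if c]
    if node.importance >= threshold or kept:
        return BTNode(node.name, node.importance, kept)
    return None

def mental_effort(node: BTNode) -> int:
    """A crude proxy for cognitive load: count of nodes the user must inspect."""
    return 1 + sum(mental_effort(c) for c in node.children)

# Usage: prune a toy tree and compare effort before and after.
tree = BTNode("root", 1.0, [
    BTNode("avoid-obstacle", 0.9),
    BTNode("idle-animation", 0.1),
    BTNode("navigate", 0.8, [BTNode("replan", 0.7),
                             BTNode("log-telemetry", 0.05)]),
])
pruned = prune(tree, threshold=0.5)
print(mental_effort(tree), mental_effort(pruned))  # prints: 6 4
```

A design note on the sketch: pruning bottom-up lets low-importance leaves vanish while any ancestor of a retained node survives, so the simplified tree remains structurally valid, which matches the abstract's requirement that decision-critical structure be preserved.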

Downloads

Slides · Paper · Transcript (English, automatic)
