CogSci 2025

August 02, 2025

San Francisco, United States


Keywords: interactive behavior, social cognition, theory of mind, human-computer interaction, psychology, artificial intelligence

The Turing test typically evaluates machine intelligence by asking whether a human judge can distinguish between human and AI conversational behavior. But the test also serves as an evaluation of the judge, upon whose discriminative capabilities the merit of the test depends. We investigate this dependency by replicating two variations of the Turing test: (1) a displaced test, where human participants judge transcripts of previously conducted interrogations, and (2) an inverted test, where AI systems make similar judgments. Comparing these with traditional interactive tests, we find that displaced judges perform similarly to interactive judges, and LLM judges perform significantly worse than humans. This challenges assumptions about the importance of real-time interaction, and suggests that accuracy is not significantly impacted by displacement, but may be impacted by differences in a judge’s model of human vs. AI behavior. Our results have implications for societal risks of AI, as systems that can consistently deceive both interactive and passive observers could enable large-scale online impersonation and manipulation.
