CogSci 2025

August 02, 2025

San Francisco, United States


Keywords: interactive behavior, social cognition, theory of mind, human-computer interaction, psychology, artificial intelligence

The Turing test typically evaluates machine intelligence by asking whether a human judge can distinguish between human and AI conversational behavior. But the test also serves as an evaluation of the judge, upon whose discriminative capabilities the merit of the test depends. We investigate this dependency by replicating two variations of the Turing test: (1) a displaced test, where human participants judge transcripts of previously conducted interrogations, and (2) an inverted test, where AI systems make similar judgments. Comparing these with traditional interactive tests, we find that displaced judges perform similarly to interactive judges, and LLM judges perform significantly worse than humans. This challenges assumptions about the importance of real-time interaction, and suggests that accuracy is not significantly impacted by displacement, but may be impacted by differences in a judge’s model of human vs. AI behavior. Our results have implications for societal risks of AI, as systems that can consistently deceive both interactive and passive observers could enable large-scale online impersonation and manipulation.
