Human-AI Interaction
Prof. Mark Steyvers
Productive human-AI collaboration requires appropriate reliance, yet contemporary AI systems are often miscalibrated, exhibiting systematic overconfidence or underconfidence. We investigate whether humans can learn to mentally recalibrate AI confidence signals through repeated experience. In a behavioral experiment (N = 200), participants predicted the AI's correctness across four AI calibration conditions: standard, overconfidence, underconfidence, and a counterintuitive "reverse confidence" mapping. We develop a computational model that uses a linear-in-log-odds (LLO) transformation and a Rescorla-Wagner learning rule to explain participants' trial-by-trial adaptation, and we estimate the model using Bayesian multilevel inference to capture group-level trends and individual variability. Results demonstrate robust learning across all conditions, with participants significantly improving their accuracy, discrimination, and calibration alignment over 50 trials. The model reveals that humans adapt by updating their baseline trust and confidence sensitivity, using asymmetric learning rates to prioritize the most informative errors. While humans can compensate for monotonic miscalibration, we identify a significant boundary condition in the reverse confidence scenario, where a substantial proportion of participants struggled to override initial inductive biases. These findings provide a mechanistic account of how humans adapt their trust in AI confidence signals through experience.
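To make the modeling idea in this abstract concrete, the following is a minimal sketch of how an LLO transformation might be combined with a Rescorla-Wagner-style update. It is an illustration under stated assumptions, not the authors' model: the function names (`llo`, `rw_update`), the parameter names (`slope` for confidence sensitivity, `intercept` for baseline trust), the gradient-like form of the update, and the learning-rate values are all hypothetical, and the abstract does not specify how the asymmetry in learning rates is defined.

```python
import math

def logit(p):
    # Clamp to avoid infinite log-odds at 0 or 1.
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def llo(conf, slope, intercept):
    # Linear-in-log-odds map: the AI's stated confidence is transformed
    # linearly on the log-odds scale, then mapped back to a probability.
    # slope = assumed "confidence sensitivity", intercept = assumed "baseline trust".
    return sigmoid(slope * logit(conf) + intercept)

def rw_update(slope, intercept, conf, correct, lr_pos=0.3, lr_neg=0.1):
    # Delta-rule (Rescorla-Wagner-style) update driven by the prediction error.
    # ASSUMPTION: asymmetry is modeled as separate learning rates for
    # positive and negative errors; the values here are arbitrary.
    pred = llo(conf, slope, intercept)
    error = correct - pred              # correct is 1 (AI right) or 0 (AI wrong)
    lr = lr_pos if error > 0 else lr_neg
    intercept += lr * error
    slope += lr * error * logit(conf)
    return slope, intercept, pred

# Hypothetical overconfident AI: it always reports 0.9 but is often wrong.
slope, intercept = 1.0, 0.0             # start by taking confidence at face value
for outcome in [0, 1, 0, 0, 1, 0, 1, 0]:
    slope, intercept, pred = rw_update(slope, intercept, 0.9, outcome)
print(round(llo(0.9, slope, intercept), 3))
```

With `slope = 1` and `intercept = 0` the mapping leaves the stated confidence unchanged, so learning in this sketch amounts to moving those two parameters away from face-value trust.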
This is an in-person presentation on July 18, 2026 (10:40 ~ 11:00 EDT).
Joachim Vandekerckhove
Prof. Clintin Davis-Stober
As artificial intelligence systems increasingly function as decision-making agents, their evaluation remains largely performance-based, emphasizing benchmark accuracy rather than measurement of underlying cognitive properties. In this talk, I argue for the development of an AI psychometrics: a formal measurement framework for artificial agents grounded in principles from representational measurement theory and computational cognitive modeling. Unlike human respondents, modern generative AI models offer several measurement affordances: (1) they directly output probability distributions over response alternatives, (2) they can be retested without contamination from learning or fatigue, and (3) they permit structural intervention. I show how these affordances enable the use of computational cognitive models to quantify latent properties. I will outline the theoretical foundations and present empirical applications demonstrating how this approach reveals properties invisible to traditional benchmark metrics, as a stepping stone toward a principled measurement science for artificial cognition.
This is an in-person presentation on July 18, 2026 (11:20 ~ 11:40 EDT).
Prof. Mark Steyvers
Prior work on cognitive offloading to AI has largely focused on performance, often comparing accuracy when delegating to AI versus completing a task manually. Less is known about how people weigh completion time and effort when deciding whether to perform tasks themselves or delegate them to an AI, and about the potential biases that shape these offloading choices. To formalize and measure these decisions, we treat human-AI delegation as a choice process under uncertainty and ask whether offloading tracks objective trade-offs in time and effort or reflects a systematic preference for automation. We conducted behavioral experiments in which participants chose between delegating to an AI robot and completing the task themselves, with the two options differing in time and effort costs. Across two task blocks, the relative advantage alternated between self-completion and delegation, allowing us to test for adaptation to changing trade-offs. Model-based analyses showed that participants were sensitive to time and effort differences. However, their choices also showed a systematic bias toward delegation that objective cost differences cannot entirely explain. These findings provide a basis for measuring delegation bias and individual differences, enabling generalizable predictions about offloading behavior and informing how AI-assisted workflows could optimize delegation rather than default to it.
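One way to picture the kind of choice model this abstract alludes to is a logistic rule in which the probability of delegating depends on time and effort differences plus a constant bias term. The sketch below is an assumption for illustration only: the abstract does not state the model's functional form, and the function name `p_delegate`, the weights `w_time` and `w_effort`, and the `bias` parameter are hypothetical.

```python
import math

def p_delegate(time_self, time_ai, effort_self, effort_ai,
               w_time=0.5, w_effort=0.5, bias=0.0):
    # Cost advantage of delegating: positive when doing the task oneself
    # takes more time or effort than handing it to the AI.
    advantage = (w_time * (time_self - time_ai)
                 + w_effort * (effort_self - effort_ai))
    # bias > 0 shifts choices toward delegation even when costs are equal,
    # which is one way a "delegation bias" could be expressed (assumption).
    return 1.0 / (1.0 + math.exp(-(advantage + bias)))

# Equal costs: an unbiased chooser is indifferent, a biased one is not.
print(p_delegate(10, 10, 3, 3))
print(p_delegate(10, 10, 3, 3, bias=1.0))
```

Under this sketch, sensitivity to trade-offs would show up as nonzero weights, while a preference for automation beyond objective costs would show up as a positive `bias`.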
This is an in-person presentation on July 18, 2026 (11:40 ~ 12:00 EDT).