Psychological researchers apply quantitative models to discover the structure of their construct of interest, typically relying on the fit of the model to the data to reach their conclusions. It has been argued, however, that model fit is not sufficient to shield one from model misspecification, that is, from inaccurately representing the underlying psychological process of interest. Yet it is unclear to what extent misspecified models can be estimated reliably, and whether the reliability with which parameters can be estimated may signal such misspecification. In this work, we simulate and estimate a range of correctly specified and misspecified models, covering the inclusion or exclusion of interaction effects, aggregation across heterogeneous populations, and linear versus nonlinear model structure. For each scenario, we then compute the reliability of parameter estimation under the correctly specified and the misspecified model and compare the results. We predict that misspecified models may still yield parameter estimates with acceptable reliability metrics when evaluated in isolation. Additionally, we predict that when compared directly to correctly specified models, these same misspecified models will exhibit distinctive patterns of reliability degradation reflecting compensatory mechanisms. This dual perspective highlights that while misspecified models might appear adequate when judged solely on standard reliability metrics, comparative analysis against well-specified alternatives can reveal systematic differences that signal model-process misalignment.
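The logic of such a simulation study can be sketched in a few lines. The sketch below is illustrative only and not the authors' actual design: it assumes one specific misspecification (omitting an interaction term from a per-subject linear model), assumed sample sizes, and test-retest correlation of parameter estimates across two simulated sessions as the reliability metric.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_session(true_slopes, true_inter, n_trials=40, noise_sd=1.0):
    """One measurement session per subject: y = b_i*x + c_i*(x*z) + noise."""
    n_sub = len(true_slopes)
    x = rng.normal(size=(n_sub, n_trials))
    z = rng.normal(size=(n_sub, n_trials))
    y = true_slopes[:, None] * x + true_inter[:, None] * x * z
    y += rng.normal(scale=noise_sd, size=y.shape)
    return x, z, y

def fit_slopes(x, z, y, include_interaction):
    """Per-subject least squares; the misspecified fit omits the x*z term."""
    slopes = []
    for xi, zi, yi in zip(x, z, y):
        X = np.column_stack([xi, xi * zi]) if include_interaction else xi[:, None]
        beta, *_ = np.linalg.lstsq(X, yi, rcond=None)
        slopes.append(beta[0])
    return np.array(slopes)

n_sub = 100
b = rng.normal(1.0, 0.5, n_sub)   # per-subject main-effect slopes (the target)
c = rng.normal(0.8, 0.3, n_sub)   # per-subject interaction strengths

rel = {}
for spec in (True, False):
    # Test-retest reliability: correlate estimates from two independent sessions.
    est1 = fit_slopes(*simulate_session(b, c), include_interaction=spec)
    est2 = fit_slopes(*simulate_session(b, c), include_interaction=spec)
    rel[spec] = np.corrcoef(est1, est2)[0, 1]

print(f"correct model reliability:      {rel[True]:.2f}")
print(f"misspecified model reliability: {rel[False]:.2f}")
```

In this toy setting the omitted interaction term inflates the residual variance of the misspecified fit, so its reliability can still look acceptable in isolation while being systematically lower than that of the correctly specified model, which is the kind of comparative signature the abstract describes.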
Individual differences in human abilities are traditionally treated as having two components: fluctuating states, and traits that are stable over time apart from very slow developmental and aging changes. Both biological and psychological sources, such as circadian rhythms, sleep debt, affective states, and learning and forgetting, cause fluctuations on time scales ranging from seconds, minutes and hours to days, weeks and longer periods. We investigate the implications of these multiple scales of temporal variation in states for the measurement of human abilities. We show that multi-scale variation implies that both test-retest reliability and external validity can be improved, relative to traditional single-occasion testing, by briefer measurements on multiple occasions. We discuss how these results can be used to optimise the new ecological momentary assessment opportunities afforded by mobile measurement technologies.
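The core claim, that spreading a fixed measurement budget over several brief occasions beats one long session when states vary on a slow time scale, can be checked with a minimal simulation. All variance components, sample sizes, and the additive trait-plus-state score model below are illustrative assumptions, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(1)

n_sub, n_days = 500, 8
trait = rng.normal(size=n_sub)                            # stable trait (target)
day_state = rng.normal(scale=0.7, size=(n_sub, n_days))   # slow, day-level state
item_noise = 0.9                                          # fast, moment-level noise

def measure(day_idx, n_items):
    """Observed score = trait + that day's state + mean of momentary errors."""
    e = rng.normal(scale=item_noise, size=(n_sub, n_items)).mean(axis=1)
    return trait + day_state[:, day_idx] + e

budget = 40  # total items, held constant across protocols

# Protocol A: one long session on a single day.
single = measure(0, budget)
# Protocol B: the same budget split into brief sessions across n_days days.
multi = np.mean([measure(d, budget // n_days) for d in range(n_days)], axis=0)

# Validity: correlation of each protocol's score with the true trait.
val_single = np.corrcoef(single, trait)[0, 1]
val_multi = np.corrcoef(multi, trait)[0, 1]
print(f"validity, single occasion:        {val_single:.2f}")
print(f"validity, {n_days} brief occasions: {val_multi:.2f}")
```

Averaging over days shrinks the slow state variance by roughly a factor of `n_days`, which a single long session cannot do however many items it contains; this is the mechanism behind the abstract's recommendation for brief, repeated mobile assessments.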