Bayesian Methods
Prof. Andrew Heathcote
Prof. Birte Forstmann
Dr. Dora Matzke
Studying individual differences in psychology often involves examining correlations across various measures. However, research involving high-dimensional data—such as in task batteries or neuroscience—often targets latent constructs rather than individual correlations. Furthermore, the number of correlations grows quadratically with increasing dimensionality, potentially leading to overfitting and spurious inference. Therefore, researchers commonly use factor analysis to study individual differences. However, conventional approaches ignore the hierarchical structure of the data and overlook measurement error, leading to attenuated factor loadings. We introduce a Bayesian framework that integrates hierarchical modeling, which accounts for measurement error, with factor analysis, which infers latent structures. The framework employs novel techniques in the field of Bayesian factor analysis to facilitate model comparison and reduce a priori constraints. The accompanying software enables the creation of generative models at the individual level, supporting a wide range of hypotheses—from descriptive to theory-driven models—and facilitating robust group-level inferences grounded in psychological theory. Through simulations and empirical applications, we demonstrate that our hierarchical factor analysis method flexibly and reliably estimates latent structures in high-dimensional data, offering a valuable tool for individual-differences research in psychology and neuroscience.
This is an in-person presentation on July 27, 2025 (09:00 ~ 09:20 EDT).
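The attenuation of correlations by measurement error that motivates this abstract can be illustrated with a minimal simulation. This is a generic demonstration of classical attenuation (observed correlation roughly equals the true correlation times the square root of the product of the reliabilities), not the authors' framework; all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000  # simulated participants

# Two latent traits with a true correlation of .7.
r_true = 0.7
cov = [[1.0, r_true], [r_true, 1.0]]
latent = rng.multivariate_normal([0.0, 0.0], cov, size=n)

# Observed scores add measurement noise; with noise variance 1,
# each measure has reliability 1 / (1 + 1) = .5.
observed = latent + rng.normal(0.0, 1.0, size=(n, 2))

r_obs = np.corrcoef(observed.T)[0, 1]
# Classical attenuation: r_obs ~= r_true * sqrt(.5 * .5) = .35
print(round(r_obs, 2))  # close to 0.35
```

A naive factor analysis of the observed scores inherits this attenuation, which is why the abstract's hierarchical treatment of measurement error matters.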
Dan Barch Jr.
This talk presents the rationale and methods for doing distribution-free Bayesian statistical analyses with the Barch and Chechile (2023) DFBA R package. The DFBA package provides tools for conducting Bayesian counterparts to several common frequentist nonparametric methods - including the Mann-Whitney U test, the Wilcoxon Signed-Rank test, the Kendall Tau correlation, and more - as well as functions to assist with experimental planning and distribution-free model selection. We also discuss some new software functions that are in development; these functions provide Bayesian methods for survival analysis in a biomedical context and address the generalization of bivariate statistical association to include multiple regression and stepwise regression. Examples of all the new statistical procedures are discussed, and comparisons are drawn to show how these methods improve upon existing frequentist analyses.
This is an in-person presentation on July 27, 2025 (09:20 ~ 09:40 EDT).
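The core idea behind a Bayesian counterpart to the Mann-Whitney U test is to place a posterior on the parameter omega = P(E > C), the probability that a random observation from one condition exceeds one from the other. The toy Python sketch below uses a crude flat-prior beta approximation to convey that idea; it is not the DFBA package's actual algorithm, which is implemented in R with more careful discrete and large-sample approximations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two independent samples (e.g., experimental vs. control condition).
E = rng.normal(0.5, 1.0, size=40)
C = rng.normal(0.0, 1.0, size=40)

# Mann-Whitney counts: pairs where E wins vs. pairs where C wins.
U_E = sum((e > c) for e in E for c in C)
U_C = len(E) * len(C) - U_E

# Toy approximation (an assumption for illustration only): treat the
# posterior of omega = P(E > C) as Beta(U_E + 1, U_C + 1) under a flat prior.
draws = rng.beta(U_E + 1, U_C + 1, size=20_000)
post_mean = draws.mean()
p_gt_half = (draws > 0.5).mean()
print(round(post_mean, 3), round(p_gt_half, 3))
```

The posterior mean estimates omega, and the mass above .5 plays the role the p-value plays in the frequentist test, but with a direct probabilistic interpretation.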
Michael Lee
Two major open questions in the study of reasoning are (1) what functions people compute to draw conclusions from given pieces of information, or premises; and (2) how people interpret the meanings of the premises they draw conclusions from. For example, how justified is it to conclude "they travelled by train" on the basis that "If they went to Ohio, then they travelled by train" and "they went to Ohio", and why? Although these questions have been debated for thousands of years, it is typically difficult to distinguish competing theories empirically because they tend to be defined only verbally, not computationally; and because they usually overlap in the predictions they make. This talk presents the current state of an ongoing project in which we translate verbal theories of how people reason with and interpret conditional premises like "If they went to Ohio, then they travelled by train" into computational form. Building on the hypothesis that people try not to contradict themselves when reasoning, we derive sets of internally consistent conclusions for a range of inferences and premise interpretations, and formalize them as components of a Bayesian latent-mixture model. Applying the model to simulated and existing reasoning datasets, we illustrate how different combinations of inferences provide more or less information for distinguishing between competing theories based on the specificity and degree of overlap in their predictions.
This is an in-person presentation on July 27, 2025 (10:00 ~ 10:20 EDT).
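The consistency-based derivation the abstract describes can be sketched for a single case: modus ponens under a material-conditional interpretation of "If they went to Ohio, then they travelled by train" (one interpretation among those such theories compare). The enumeration below is a generic illustration of deriving the internally consistent conclusion set, not the authors' latent-mixture model.

```python
from itertools import product

# Worlds assign truth values to the atomic propositions
# O = "they went to Ohio", T = "they travelled by train".
worlds = list(product([True, False], repeat=2))

def material_conditional(o, t):
    # "If O then T" read as the material conditional: false only when O and not T.
    return (not o) or t

# Premises of modus ponens: the conditional, plus O itself.
premises = [lambda o, t: material_conditional(o, t),
            lambda o, t: o]

# Worlds consistent with all premises.
consistent = [(o, t) for o, t in worlds if all(p(o, t) for p in premises)]

# A conclusion can be drawn without self-contradiction iff it holds
# in every premise-consistent world.
follows_T = all(t for _, t in consistent)
print(consistent, follows_T)  # [(True, True)] True
```

Under this interpretation only one world survives, so "they travelled by train" follows; other premise interpretations leave more worlds open and license weaker conclusions, which is what makes some inferences more diagnostic than others for distinguishing theories.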
Joachim Vandekerckhove
Bayesian inference provides a principled framework for modeling cognitive and psychometric data, but scalability remains a challenge. Traditional MCMC methods become computationally impractical when working with very large datasets. In these scenarios, MCMC methods often require extended periods of continuous computing and in many cases result in defective, non-convergent chains. Divide-and-conquer (DC) methods offer a scalable alternative by partitioning data into disjoint subsets, performing computations separately on each, and then recombining the “subposterior” MCMC samples to approximate the full posterior distribution. Crucially, the choice of recombination strategy directly affects posterior accuracy and predictive performance. Here, we evaluate the performance of different recombination strategies across full and partitioned datasets in cognitive and psychometric models, including Item Response Theory models. Comparing DC recombination against full-data inference allows us to explore the degree to which different strategies accurately track the target posterior. Using this approach, we systematically compare trade-offs between computing costs and posterior accuracy, highlighting the conditions under which different recombination rules succeed or fail. Our analysis considers factors such as the size of the dataset and the type of model, providing insights into the practical limitations of DC methods for scalable Bayesian inference.
This is an in-person presentation on July 27, 2025 (09:40 ~ 10:00 EDT).
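The divide-and-conquer pipeline the abstract describes (partition, fit each shard, recombine subposterior draws) can be sketched with one simple recombination rule: precision-weighted, consensus-style averaging of subposterior samples. This is a generic illustration in a conjugate normal model where the subposteriors are available in closed form; it is not necessarily one of the strategies evaluated in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y ~ Normal(mu, 1), vague normal prior on mu.
mu_true = 2.0
y = rng.normal(mu_true, 1.0, size=10_000)

def posterior_samples(data, n_draws=5_000, prior_var=100.0):
    """Conjugate normal posterior for mu with known unit data variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n)
    post_mean = post_var * data.sum()
    return rng.normal(post_mean, np.sqrt(post_var), size=n_draws)

# Step 1: partition into K disjoint shards. Each shard inflates the prior
# variance by K so the product of subposteriors matches the full posterior.
K = 10
shards = np.array_split(y, K)
subs = np.stack([posterior_samples(s, prior_var=100.0 * K) for s in shards])

# Step 2: consensus recombination - precision-weighted average of the
# K subposterior draws, one weight per shard.
weights = 1.0 / subs.var(axis=1, ddof=1)
consensus = (weights[:, None] * subs).sum(axis=0) / weights.sum()

# Reference: the full-data posterior the DC scheme is trying to recover.
full = posterior_samples(y, prior_var=100.0)
print(consensus.mean(), full.mean())  # both close to mu_true
```

In this conjugate Gaussian case the weighted average is exact; the interesting trade-offs the abstract targets arise in non-Gaussian models such as IRT, where different recombination rules can distort the recovered posterior.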