We present a model of the dynamics of category learning tasks using the Coupled Hidden Markov Model (CHMM) framework (Villarreal and Lee, 2024). The key innovation of the CHMM approach is the assumption that participants can update the category assignment of all stimuli, including those not currently presented, on a trial-by-trial basis. Because CHMMs can revise their inferences about category assignments in light of future observations, they are difficult to evaluate. To address this problem, we demonstrate two approaches for evaluating a CHMM by comparing its predictions to those of the Generalized Context Model of categorization (GCM; Nosofsky, 1988). The first approach uses leave-n-out cross-validation with data from a category learning experiment reported by Navarro et al. (2005), in which participants classify pictures of faces into one of two categories. The second approach uses a generalization test based on a learning-transfer categorization task with simple shape stimuli reported by Bartlema et al. (2014). Our results show that the predictions of the CHMM are at least as accurate as those of the GCM. These findings suggest that the CHMM's ability to account accurately for data from category learning tasks is not a consequence of its added flexibility.
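To make the GCM baseline concrete, the sketch below implements a minimal version of its standard equations: similarity to each stored exemplar decays exponentially with distance, and the probability of a category response follows the Luce choice rule over summed within-category similarities. The stimuli, parameter values, and function names are illustrative assumptions, not drawn from the experiments above.

```python
import numpy as np

def gcm_category_probability(probe, exemplars, labels, c=1.0, r=1.0):
    """P(category 1 | probe) under a minimal GCM (illustrative sketch)."""
    # Minkowski distance from the probe to every stored exemplar
    # (r = 1 city-block, r = 2 Euclidean).
    dists = np.sum(np.abs(exemplars - probe) ** r, axis=1) ** (1.0 / r)
    # Similarity decays exponentially with distance, scaled by sensitivity c.
    sims = np.exp(-c * dists)
    # Luce choice rule over summed within-category similarities.
    s1 = sims[labels == 1].sum()
    s0 = sims[labels == 0].sum()
    return s1 / (s1 + s0)

# Hypothetical two-dimensional stimuli with known category assignments.
exemplars = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]])
labels = np.array([0, 0, 1, 1])
print(gcm_category_probability(np.array([0.7, 0.7]), exemplars, labels, c=2.0))
```

In this setting, leave-n-out cross-validation amounts to fitting the model's parameters (such as c) on all but n trials and scoring its predictions for the held-out responses.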
Instance (or exemplar) models of memory and inference have been used to explain data from numerous experiments, though they have been criticized for lacking a broader theory of conceptual knowledge. Recently, we showed that instance models can be implemented as the update equation of a class of attractor networks, in which varying the amount of competition during retrieval allows the networks to flexibly retrieve both individual items and the means of clusters from the same memories. In this work, we show that the same networks can recover hierarchical category structures, such as those seen in real-world semantic categories. We first consider an artificial hierarchical dataset, finding that a variety of instance-based networks, including Hopfield networks, the Brain-State-in-a-Box model, the MINERVA 2 architecture, modern approaches using lateral inhibition (similar to SUSTAIN and the Adaptive Representation Model), and continuous-valued Modern Hopfield Networks, can each recover hierarchical structures under ideal data conditions. Critically, given an item as a retrieval cue, prototypes at each hierarchical level can be retrieved using a simple attentional mechanism, providing a potential route to deliberately control which information is retrieved. We then examine more realistic memory representations by storing noisy, pretrained GloVe, Word2Vec, and BERT embeddings, as well as embeddings obtained from human feature norms (McRae et al., 2005), in each architecture. Overall, models with lateral inhibition and nonlinear competitive dynamics can retrieve hierarchical representations with GloVe, Word2Vec, and feature-norm embeddings, while BERT embeddings possess less hierarchical information for the categories we consider.
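As a minimal illustration of prototype retrieval from instance memories, the sketch below stores noisy copies of a binary prototype in a classical Hopfield network and cues it with one stored item; because the stored items are correlated, the attractor dynamics tend to settle on the cluster prototype rather than the cued instance. The pattern size, noise level, and helper names are assumptions for illustration; the attentional mechanism and the other architectures discussed above are not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

def store(patterns):
    """Hebbian outer-product storage with zeroed self-connections."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def retrieve(W, cue, max_steps=50):
    """Synchronous sign-function dynamics, iterated to a fixed point."""
    x = cue.copy()
    for _ in range(max_steps):
        x_new = np.sign(W @ x)
        x_new[x_new == 0] = 1.0  # break ties toward +1
        if np.array_equal(x_new, x):
            break
        x = x_new
    return x

# Hypothetical cluster: eight items, each the prototype with 15% of features flipped.
n = 200
prototype = rng.choice([-1.0, 1.0], size=n)
items = np.array([prototype * rng.choice([1.0, -1.0], size=n, p=[0.85, 0.15])
                  for _ in range(8)])

W = store(items)
recalled = retrieve(W, items[0])
print("overlap with prototype:", np.mean(recalled == prototype))
```

With a few uncorrelated patterns, the same dynamics would instead return the cued item itself; which outcome obtains depends on the correlational structure of the stored instances and, in the architectures above, on the amount of retrieval competition.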