LLMs & Artificial Intelligence
Mr. Anirudh Doppalapudi
When deep learning models are trained on new tasks, their performance on previously learned tasks drops sharply, a phenomenon termed catastrophic forgetting. The present study uses Representational Similarity Analysis (RSA) to examine how the internal representations of networks change when they undergo catastrophic forgetting. We ask: do models completely reconfigure their internal representations of older tasks after learning a new task? We trained a network on a dataset of rock images from Nosofsky et al. (2018) spanning 3 categories. We divided the images in each category into two halves and trained a network inspired by AlexNet on the first half (Task A), followed by the second half (Task B). When catastrophic forgetting occurred in our paradigm, we compared the internal representations of Task A before and after the network learned Task B. While later layers of the network showed low RSA similarity, the early layers showed high RSA similarity, suggesting that the model retained most of the geometry of its internal representations during this paradigm. Our results were robust across training datasets (we replicated them with the MNIST dataset), across model architectures, and across task complexities. These results challenge the assumption that neural networks forget all internal representations during catastrophic forgetting; they pave the way for solving this problem in networks and also provide insight into human memory.
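As a rough illustration of the before/after comparison described in this abstract, the sketch below computes an RSA score between two sets of layer activations for the same stimuli. It is a minimal sketch under stated assumptions, not the authors' pipeline: the activations are synthetic, the dissimilarity measure (1 minus Pearson correlation) and the Pearson comparison of the two dissimilarity matrices are common choices assumed here, and the names `pearson`, `rdm_upper`, and `rsa_score` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm_upper(acts):
    """Upper triangle of the representational dissimilarity matrix.

    acts is a list of activation vectors, one per stimulus; each entry of the
    result is 1 - r for one pair of stimuli.
    """
    n = len(acts)
    return [1.0 - pearson(acts[i], acts[j])
            for i in range(n) for j in range(i + 1, n)]

def rsa_score(acts_before, acts_after):
    """Correlate the two dissimilarity structures (higher = geometry retained)."""
    return pearson(rdm_upper(acts_before), rdm_upper(acts_after))

# Synthetic stand-ins for one layer's responses to Task A stimuli (assumption).
random.seed(0)
n_stimuli, n_units = 10, 50
before = [[random.gauss(0, 1) for _ in range(n_units)] for _ in range(n_stimuli)]

# Case 1: the same geometry with units shuffled, so pairwise structure is unchanged.
perm = list(range(n_units))
random.shuffle(perm)
after_same = [[row[k] for k in perm] for row in before]

# Case 2: unrelated activations, standing in for a fully reconfigured layer.
after_new = [[random.gauss(0, 1) for _ in range(n_units)] for _ in range(n_stimuli)]

same_score = rsa_score(before, after_same)
new_score = rsa_score(before, after_new)
print("geometry preserved:", round(same_score, 3))
print("geometry replaced: ", round(new_score, 3))
```

In this toy setting the shuffled-unit case yields a score of essentially 1, while the unrelated case yields a much lower score. Applied per layer, the same comparison would give one score for each layer of a network.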
This is an in-person presentation on July 19, 2026 (09:00 ~ 09:20 EDT).
In many decision contexts, algorithmic recommendations are rejected in favor of human judgment, even when algorithms achieve higher predictive accuracy, a pattern commonly referred to as algorithm aversion. From a normative perspective, this is theoretically puzzling: superior performance should increase reliance, yet algorithm use depends on additional conditions. Existing accounts often emphasize irrational distortions, attitudes, or trust-related concerns toward algorithmic systems. However, this perspective leaves open whether algorithm aversion can emerge as a consistent outcome of subjective decision making when multiple psychological influences jointly shape expectations and utility evaluations underlying individual choice. The present work addresses this issue by developing a formal decision-theoretic model that specifies individual choice between human judgment and algorithmic forecasts in the context of managerial sales and cost forecasting. Choice is modeled as an expected-utility comparison in which subjective evaluations, perceived costs, and related influences determine option valuation. These factors are reflected in utility both directly and via their impact on subjective expectations. The model demonstrates that algorithm aversion can arise from the interaction of subjective evaluations and constraints in individual choice behavior. Simulations and comparative statics reveal nonlinear and threshold patterns under which performance advantages fail to translate into algorithmic choice. In particular, the model explains why further improvements in specific influencing factors produce diminishing or non-monotonic effects on algorithm acceptance. Seemingly counterintuitive empirical findings can thus be reconciled within a unified formal framework.
This is an in-person presentation on July 19, 2026 (09:20 ~ 09:40 EDT).
James Jennings
Prof. Clintin Davis-Stober
As AI systems increasingly act as autonomous agents making sequential decisions, they must possess stable utility representations to ensure safe human-AI alignment. However, current evaluations prioritize ground-truth accuracy over structural rationality. Fundamental choice axioms, such as transitivity of preferences, provide testable conditions for determining whether a decision-making process is structurally rational. To move beyond standard performance benchmarks and evaluate underlying decision-making consistency, this study systematically varies (1) model architecture, (2) question format, (3) generation temperature, and (4) contextual memory across 20 distinct LLMs. The experimental design is structured into two investigative paths. Path 1 observes the effects of stochasticity, systematically measuring how increasing an LLM's temperature influences its adherence to transitivity. Path 2 focuses on contextual memory, evaluating whether providing LLMs with a memory of their own prior choices impacts their overall rationality. To capture these nuances, choice outputs were evaluated against several established models of transitivity. The evaluations reveal distinct behavioral patterns. For Path 1, while increased temperature amplified the probability of the model selecting a specific choice, it did not significantly alter adherence to the evaluated transitivity models. Conversely, for Path 2, providing models with sequential memory increased their adherence to strictly constrained models of transitivity, but decreased adherence to less constrained models. We propose that utilizing transitivity axioms in AI assessments provides a definitive benchmark for output quality, while laying the theoretical foundation for true computational alignment and a fundamental understanding of artificial decision-making.
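The abstract does not name the transitivity models it evaluates. As one illustration, the sketch below checks weak stochastic transitivity, a commonly used condition: if option a is chosen over b at least half the time, and b over c at least half the time, then a should be chosen over c at least half the time. This is a minimal sketch under that assumption; the choice proportions are made up, the function name `wst_violations` is hypothetical, and a descriptive check like this is not a substitute for a statistical test of the model.

```python
from itertools import permutations

def wst_violations(p):
    """Return ordered triples (a, b, c) that violate weak stochastic transitivity.

    p maps an ordered pair (x, y) to the proportion of trials on which x was
    chosen over y; the reverse pair is taken to be 1 minus that proportion.
    """
    items = sorted({x for pair in p for x in pair})

    def prob(x, y):
        return p[(x, y)] if (x, y) in p else 1.0 - p[(y, x)]

    return [
        (a, b, c)
        for a, b, c in permutations(items, 3)
        if prob(a, b) >= 0.5 and prob(b, c) >= 0.5 and prob(a, c) < 0.5
    ]

# Hypothetical choice proportions for three options (illustrative only).
consistent = {("A", "B"): 0.8, ("B", "C"): 0.7, ("A", "C"): 0.9}
cyclic = {("A", "B"): 0.8, ("B", "C"): 0.7, ("A", "C"): 0.2}

consistent_violations = wst_violations(consistent)
cyclic_violations = wst_violations(cyclic)
print(consistent_violations)
print(cyclic_violations)
```

The cyclic pattern (A over B, B over C, C over A) is reported once for each rotation of the cycle, so a single intransitive cycle appears as three ordered triples.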
This is an in-person presentation on July 19, 2026 (09:40 ~ 10:00 EDT).
Kiwon Song
James Jennings
Konstantina Sokratous
Prof. Clintin Davis-Stober
Large Language Models (LLMs) are increasingly deployed in high-stakes decision-making contexts with moral implications, highlighting the need to develop effective, theory-driven frameworks to evaluate their responses. In particular, it is critical to assess the structure and generalizability of AI moral judgments. We propose adopting methods used in human decision-making research to evaluate comparative moral judgments with respect to the transitivity axiom from fundamental measurement theory. This approach gives us a basis for comparing AI moral choices with those documented in humans. Using validated vignettes derived from Moral Foundations Theory (MFT), we evaluated various LLMs on trade-offs among seven moral principles: authority, emotional care, fairness, liberty, loyalty, physical care, and sanctity. To assess whether the method of eliciting LLM outputs affects response behavior, we analyzed both log probabilities of first-token responses and full-text outputs. We tested the resulting rank-orderings of moral principles against three order-constrained probabilistic choice models. Through this analysis, we model permissible choice representations (e.g., weak utility, general random utility) of varying restrictiveness. We compare the best-fitting model with results from a similar human experiment. Findings provide theory-driven insight into whether LLMs exhibit transitive moral preferences over MFT principles in their decision-making and whether decisions align with established theories of moral principle orderings in humans. Future work should investigate the influence of demographic and contextual factors on AI moral decision-making.
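One way to turn first-token log probabilities into a binary choice probability is to renormalize over the two option tokens. The sketch below illustrates that step only. It assumes a two-option forced choice in which each option begins with a distinct token; the log probabilities are made-up values rather than model output, the function name `choice_prob` is hypothetical, and the abstract does not state that this is the procedure used.

```python
import math

def choice_prob(logp_a, logp_b):
    """Probability of choosing option A after renormalizing over the two options.

    logp_a and logp_b are the log probabilities assigned to the first token of
    each option. Subtracting the maximum before exponentiating avoids underflow.
    """
    m = max(logp_a, logp_b)
    ea, eb = math.exp(logp_a - m), math.exp(logp_b - m)
    return ea / (ea + eb)

# Illustrative values: 60% of the probability mass on option A's token, 20% on
# option B's token, and the remainder on tokens that match neither option.
p_a = choice_prob(math.log(0.6), math.log(0.2))
print(round(p_a, 2))  # 0.75
```

Pairwise probabilities obtained this way could be compared with the proportions tallied from repeated full-text responses to the same vignette pair.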
This is an in-person presentation on July 19, 2026 (10:00 ~ 10:20 EDT).