Are LLM Belief Updates Consistent with Bayes’ Theorem?

Published in ICML 2025 Workshop on Assessing World Models, 2025

Abstract:

Do larger and more capable language models learn to update their “beliefs” about propositions more consistently with Bayes’ theorem when given new evidence in context? To test this, we formulate a Bayesian Coherence Error (BCE) metric and generate a dataset with which to measure BCE. We measure BCE for multiple pre-trained-only language models across five model families, comparing it against the number of model parameters, the amount of training data, and model scores on common benchmarks. Our results imply the opposite of our hypothesis: larger and more capable pre-trained language models assign credences that are less coherent with Bayes’ theorem. We discuss potential explanations for, and implications of, our results.
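For context, Bayes’ theorem prescribes how a credence in a proposition $H$ should be updated after observing evidence $E$. A minimal sketch of what a Bayesian coherence error could measure (an illustrative formulation, not necessarily the paper’s exact BCE definition) is the gap between the credence a model assigns to $H$ after seeing $E$ and the posterior implied by its own stated prior and likelihoods:

$$
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)},
\qquad
\mathrm{BCE}_{\text{illustrative}} \;=\; \left|\, q(H \mid E) \;-\; \frac{q(E \mid H)\,q(H)}{q(E \mid H)\,q(H) + q(E \mid \neg H)\,q(\neg H)} \,\right|,
$$

where $q(\cdot)$ denotes the credences elicited from the language model. Under such a formulation, a perfectly Bayesian-coherent model would achieve an error of zero regardless of whether its individual credences are accurate.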

Recommended citation: Sohaib Imran, Ihor Kendiukhov, Matthew Broerman, Aditya Thomas, Riccardo Campanella, Rob Lamb, Peter M. Atkinson. “Are LLM Belief Updates Consistent with Bayes’ Theorem?” ICML 2025 Workshop on Assessing World Models, 2025.
Download Paper | Download Slides