"I've Seen How This Goes": Characterizing Diversity via Progressive Conditional Surprise
Abstract
This paper introduces an information-theoretic metric that measures diversity in creative outputs using in-context learning, without requiring specialized training or reference data.
Measuring the diversity of creative outputs is central to evaluating post-training mode collapse, comparing decoding strategies, and quantifying creative behavior in both AI and human writing. We propose a new approach to measuring diversity using in-context learning. Its working instance, and the one we evaluate, is the "Decan" metric $D_{Ca_n} = C \times a_n$: a per-byte score read off the per-token log-probabilities of a base model $\theta$ in a single forward pass per permutation, with no embedding model, no reference corpus, and no human labels. The approach is grounded in information theory, uses language-model in-context learning to detect a wide range of similarities between any number of inputs, and obviates the need to train a special-purpose model. The same pipeline scores AI samples and human-written response sets, with diversity treated as a property of the triple (responses, prompt, scoring model). On Tevet and Berant's human-grounded McDiv benchmark, $D_{Ca_n}$ reaches an OCA of 0.846 on the prompt_gen set, where it performs best; this is behind the strongest neural baseline reported by Tevet and Berant (SentBERT, 0.897). On the OLMo-2-7B post-training pipeline, $D_{Ca_n}$ drops monotonically across the base, SFT, DPO, and RLVR stages, detecting the type of diversity loss that creative-writing applications care about.