Papers
arxiv:2512.05858

Prompting Science Report 4: Playing Pretend: Expert Personas Don't Improve Factual Accuracy

Published on Dec 5, 2025
Authors:
,
,
,
,
,

Abstract

Assigning expert or low-knowledge personas to models did not consistently improve performance on challenging multiple-choice questions, with some cases showing degraded accuracy.

AI-generated summary

This is the fourth in a series of short reports that help business, education, and policy leaders understand the technical details of working with AI through rigorous testing. Here, we ask whether assigning personas to models improves performance on difficult objective multiple-choice questions. We study both domain-specific expert personas and low-knowledge personas, evaluating six models on GPQA Diamond (Rein et al. 2024) and MMLU-Pro (Wang et al. 2024), graduate-level questions spanning science, engineering, and law. We tested three approaches: -In-Domain Experts: Assigning the model an expert persona ("you are a physics expert") matched to the problem type (physics problems) had no significant impact on performance (with the exception of the Gemini 2.0 Flash model). -Off-Domain Experts (Domain-Mismatched): Assigning the model an expert persona ("you are a physics expert") not matched to the problem type (law problems) resulted in marginal differences. -Low-Knowledge Personas: We assigned the model negative capability personas (layperson, young child, toddler), which were generally harmful to benchmark accuracy. Across both benchmarks, persona prompts generally did not improve accuracy relative to a no-persona baseline. Expert personas showed no consistent benefit across models, with few exceptions. Domain-mismatched expert personas sometimes degraded performance. Low-knowledge personas often reduced accuracy. These results are about the accuracy of answers only; personas may serve other purposes (such as altering the tone of outputs), beyond improving factual performance.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2512.05858 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2512.05858 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2512.05858 in a Space README.md to link it from this page.

Collections including this paper 1