HumanLM-Opinion

HumanLM is a user simulator that generates responses capturing the underlying states of real users (beliefs, emotions, stance, values, goals, communication style).

This checkpoint is trained on the Humanual-Opinion benchmark, which contains Reddit users’ opinionated responses in personal-issue discussion threads.

📄 Paper: HumanLM: Simulating Users with State Alignment Beats Response Imitation
🌐 Project Page: humanlm.stanford.edu

Model Details

  • Base Model: Qwen3-8B
  • Training Method: GRPO (Group Relative Policy Optimization) with state alignment (sketched below)
  • Training Data: Humanual-Opinion (4.6k Reddit users, 46k responses across 1k threads)
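
To make the training method concrete, here is a minimal sketch of GRPO's group-relative advantage computation. Folding the six state-alignment scores into a single reward by simple averaging is our assumption for illustration, not necessarily the paper's exact reward.

from statistics import mean, pstdev

DIMENSIONS = ["belief", "goal", "value", "stance", "emotion", "communication"]

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO normalizes each sampled response's reward within its group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical per-dimension alignment scores (0-1) for G=3 responses to one prompt.
group_scores = [
    {"belief": 0.9, "goal": 0.8, "value": 0.7, "stance": 0.9, "emotion": 0.6, "communication": 0.8},
    {"belief": 0.5, "goal": 0.6, "value": 0.6, "stance": 0.4, "emotion": 0.5, "communication": 0.5},
    {"belief": 0.7, "goal": 0.7, "value": 0.8, "stance": 0.7, "emotion": 0.7, "communication": 0.6},
]
rewards = [mean(s[d] for d in DIMENSIONS) for s in group_scores]
advantages = group_relative_advantages(rewards)  # above-group-average responses get positive advantage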

What Makes HumanLM Different?

Unlike standard fine-tuning, which imitates surface-level language, HumanLM explicitly aligns responses along six psychologically grounded state dimensions:

Aspect     | Dimensions     | Description
Cognitive  | belief, goal   | What the user thinks is true; what they want to achieve
Normative  | value, stance  | What matters to them; their position on specific topics
Affective  | emotion        | How they feel about the situation
Linguistic | communication  | How they structure and express their message

During generation, the model reasons about these latent states in a <think> block before synthesizing the final response.
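
If you post-process the model's reasoning, it can help to mirror these dimensions in code. A minimal sketch (the UserState class is our own illustration, not an interface the model exposes):

from dataclasses import dataclass

@dataclass
class UserState:
    """The six latent state dimensions HumanLM reasons over, grouped by aspect."""
    belief: str         # cognitive: what the user thinks is true
    goal: str           # cognitive: what they want to achieve
    value: str          # normative: what matters to them
    stance: str         # normative: their position on the specific topic
    emotion: str        # affective: how they feel about the situation
    communication: str  # linguistic: how they structure and express the message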

Quickstart

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "humanlm/humanlm-opinion"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# User persona (summarized from history)
persona = """
Demographics:
  age group: Likely 30s-40s (parent of middle school-aged child)
  other: Parent of a neurodivergent middle school-aged child
Interests:
  Family dynamics and interpersonal conflicts, particularly in AITA scenarios
  Wedding etiquette and boundary-setting in social situations
  Parent-child relationships and estrangement issues
  ...
Values:
  Believes toxic traditions should not be perpetuated: 'I think it's telling these people find perpetuating a toxic tradition "easier" then getting someone a pair of socks'
  Values earned relationships over automatic family privileges: 'they have in no way earned that privilege'
  ...
Communication:
  Balances empathy with practical advice in responses
  Employs humor and sarcasm occasionally: 'She could get a cheap tagging gun and reattach the tags'
  Makes direct, straightforward observations without excessive hedging
  ...
Statistics:
  Uses exclamation points in most responses for emphasis
  Tends to structure responses with 2-4 distinct points or sentences
  Often begins responses with agreement or validation before adding commentary
  ...
"""

# Context (e.g., a Reddit AITA post)
context = """AITA for demanding that my niece, or her parents, pay me back for the hundreds of dollars of perfume she stole from me?
(**Note: I’m not involving the police, suing anyone, etc. Please don't try to argue with me about this or "convince" me why I should.**)
I have a perfume collection that I started when I was a teenager slinging burritos as my first job. I have over 400 bottles at this point, I take great pride in my collection, and I use it.
I’m also happy to give people decants (samples) of most of my bottles, let them sample a spray or two, give some bottles as gifts, etc.
....
AITA?"""

messages = [
    {
        "role": "system",
        "content": (
            "You are a real human user. Your name is HUMAN. You will be given "
            "your persona information below and you respond to any given context "
            "such as posts and messages.\n\n"
            f"Your persona:\n{persona}\n<|The End of Persona|>\n\n"
            "## Task and Output format:<response>\n"
            "<HUMAN's actual written comment or reply text.>\n</response>"
        ),
    },
    {"role": "user", "content": context},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# "</response>" may tokenize into several sub-tokens; using the first as the EOS id
# assumes that sub-token only appears as part of the closing tag.
stop_token_id = tokenizer.encode("</response>", add_special_tokens=False)[0]

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.4,
    no_repeat_ngram_size=4,
    eos_token_id=stop_token_id,
)

# Decode only the newly generated tokens (skip the prompt).
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)

Output Format

The model generates responses in the following format:

<think>
[Reasoning about the user's latent states: stance, emotion, belief, value, goal, communication style]
</think>
<response>
[The actual user response]
</response>
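
Because generation stops at the closing tag, the decoded text may end mid-tag. A minimal parsing sketch that tolerates this (the extract_response helper is our own, not part of the model's API):

import re

def extract_response(text: str) -> str:
    """Pull the user reply out of the <response> block, dropping the <think> trace."""
    match = re.search(r"<response>\s*(.*?)\s*(?:</response>|$)", text, flags=re.DOTALL)
    if match:
        return match.group(1)
    # Fallback: strip any <think> block and return what remains.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

final_reply = extract_response(response)  # `response` from the Quickstart above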

Example Output

<think>
Okay, let's see.
...
The user's persona is a parent of a neurodivergent middle schooler, values accountability, boundaries, and doesn't tolerate toxic traditions. They support personal accountability and consequences. They also use a conversational tone with exclamation points and short sentences.
So, the key points here are: the niece stole perfume, sold it, the user documented the theft, expects repayment, and is cutting her off from their home. The sister is defending the niece, but the user is standing their ground.
The user wants to know AITA, so the response should validate their stance. The persona would likely agree that the niece should face consequences, not be excused just because she's a teenager. The user's response should emphasize that she took a risk, it didn't pay off, and that the user has the right to expect repayment. Also, the sister's argument about the niece needing money for a car doesn't matter because she chose to steal.
Need to make sure the response is in line with the persona's values: accountability, boundaries, practical problem-solving, and not enabling bad behavior.
</think>
<response>
NTA! She took a risk (selling stolen perfume, sneaking around), it didn’t pay off. You have every right to expect her to pay back the hundreds she stole and cut her off from your home. Her “kids make mistakes” excuse doesn’t excuse theft or entitlement. She’s lucky you’re not involving the law—she should face consequences.
</response>

Performance

On Humanual-Opinion, HumanLM achieves a 7.6% relative improvement in LLM-judged alignment score over the strongest baseline (GRPO-think).

Real-Time User Study (111 participants)

  • 41.4% win rate on overall similarity (vs. 30.6% Qwen3-8B-think, 27.9% GRPO-think)
  • 76.6% of responses rated as "quite natural" or "indistinguishable from human"
  • Statistically significant improvements (p < 0.05 vs Qwen3-8B-think, p < 0.01 vs GRPO-think)

Safety Evaluation

To determine whether our user-state alignment training erodes the safety behavior of the underlying aligned base model (Qwen3-8B), we ran an adversarial safety evaluation with the Azure AI Evaluation SDK. We evaluated the base model and HumanLM using adversarial prompts intended to elicit unsafe behavior, comparing their tendency to generate harmful outputs.

We use Azure AI's AdversarialSimulator to generate adversarial queries designed to elicit harmful responses. Each query is scored by four harm evaluators (Violence, Sexual, Self-Harm, and Hate/Unfairness) and assigned to the highest-scoring category. We iteratively generate queries until obtaining at least 20 per category (80 total). For each query, we generate responses from both the base model (Qwen3-8B-think) and HumanLM-Opinion, then score them using Azure AI's content safety evaluators on a 0–7 scale (0–1 = safe, 6–7 = unsafe). We report average harm scores per category; lower scores indicate safer behavior.
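
The category assignment and averaging are simple to reproduce given per-query evaluator scores. A minimal sketch of that aggregation (the data structures are illustrative; the actual scores come from the Azure AI content safety evaluators):

# Each record pairs the adversarial query's per-category harm scores with the
# harm score of the model's response, all on the 0-7 Azure severity scale.
CATEGORIES = ["violence", "sexual", "self_harm", "hate_unfairness"]

def assign_category(query_scores: dict) -> str:
    """Assign a query to its highest-scoring harm category."""
    return max(query_scores, key=query_scores.get)

def average_harm_by_category(records) -> dict:
    buckets = {c: [] for c in CATEGORIES}
    for query_scores, response_score in records:
        buckets[assign_category(query_scores)].append(response_score)
    # Lower average = safer behavior in that category.
    return {c: sum(v) / len(v) for c, v in buckets.items() if v}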

Harm Type | Qwen3-8B-think | HumanLM
Violence  | 4.40           | 4.20
Sexual    | 4.30           | 4.45
Self-harm | 3.90           | 3.60
Hate      | 3.35           | 3.55

Scores range 0-7 (lower = safer). HumanLM shows comparable safety to the base model.

Intended Use

  • User research: Understanding how different users respond to content
  • Content testing: Predicting how target audiences might react to posts, articles, or policies
  • AI alignment: Generating diverse user feedback for training collaborative AI systems
  • Social simulation: Modeling opinion dynamics in online communities

Citation

@article{wu2026humanlm,
  title={HUMANLM: Simulating Users with State Alignment Beats Response Imitation},
  url={https://humanlm.stanford.edu/},
  author={Wu, Shirley and Choi, Evelyn and Khatua, Arpandeep and
          Wang, Zhanghan and He-Yueya, Joy and Weerasooriya, Tharindu Cyril and
          Wei, Wei and Yang, Diyi and Leskovec, Jure and Zou, James},
  year={2026}
}