---
title: User Modeling Agent
emoji: πŸ“
colorFrom: green
colorTo: red
sdk: docker
app_port: 7860
pinned: false
---

# User Modeling Agent

**DSN × BCT LLM Agent Challenge 2026, Task A.**

An agent that distills a person into a behavioural *persona*, then writes the
star rating and the review that person would leave for an unseen product. It
critiques and revises its own draft before returning it.

> Live demo: https://huggingface.co/spaces/Israelbliz/User-Modeling-Agent

> Code: https://huggingface.co/spaces/Israelbliz/User-Modeling-Agent/tree/main

---

## What it does

Given a **person** and **product details**, the agent produces:

- a **star rating** (1–5) the person would likely give, and
- a **written review** in that person's voice, with tone, length, and quirks matched.

It is not a generic review generator. Every output is conditioned on a
specific person, and the rating is reasoned, not guessed.

## Three input modes

The same persona engine is fed by three input modes:

- **Compose a persona**: describe the person's reviewing voice in free text.
- **Dataset reader**: pick a real user from the data; the agent is scored against
  a genuinely held-out review.
- **Build from past reviews**: paste a few of the person's actual past
  reviews, and the agent builds the persona from them.
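
As a rough illustration of the *Build from past reviews* mode, the quantitative half of a persona could be computed from pasted reviews along these lines. This is a minimal sketch, not the project's `PersonaEngine`: `QuantSignals` and `quant_signals` are hypothetical names, and treating "rating spread" as the population standard deviation is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class QuantSignals:
    """Hypothetical container; the field names are illustrative only."""
    avg_rating: float
    rating_spread: float  # assumed to be the population standard deviation
    avg_review_words: float
    rating_distribution: dict[int, int]


def quant_signals(reviews: list[tuple[int, str]]) -> QuantSignals:
    """Summarise (rating, text) pairs pasted by the user."""
    if not reviews:
        raise ValueError("need at least one past review")
    ratings = [rating for rating, _ in reviews]
    counts = Counter(ratings)
    return QuantSignals(
        avg_rating=mean(ratings),
        rating_spread=pstdev(ratings),
        avg_review_words=mean(len(text.split()) for _, text in reviews),
        # Always report all five star levels, including the empty ones.
        rating_distribution={star: counts.get(star, 0) for star in range(1, 6)},
    )


# Made-up sample reviews, for illustration only.
past = [
    (5, "Loved it, could not put it down."),
    (4, "Solid read, slow middle."),
    (5, "Brilliant ending."),
]
signals = quant_signals(past)
print(signals.avg_rating, signals.rating_distribution)
```

The qualitative voice (tone, themes, complaints) is distilled by an LLM and is not covered by this sketch.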

## The agentic workflow

The system is an agent, not a single prompt. It runs a five-step loop:

1. **Build the persona.** A `PersonaEngine` extracts a structured persona:
   quantitative signals (average rating, rating spread, review length,
   domains, rating distribution) and a qualitative voice (tone, preferred
   themes, common complaints, a one-line voice descriptor) distilled by an
   LLM from sample reviews, with a deterministic fallback if that call fails.

2. **Select grounding history.** For a real person, the agent picks the few
   past reviews most similar to the target item, so it writes from concrete
   evidence of how this person actually phrases things.

3. **Generate the rating and review.** A single LLM call, with the rating
   reasoned in two explicit steps: first the persona *prior* (what this
   person usually gives), then the *item evidence* (what the title and
   description signal). The final rating is the prior adjusted by the
   evidence, so a generous reviewer still rates a poor item low and a
   critical reviewer still rates a strong item high.

4. **Self-reflection: critique and revise.** A critic LLM audits the draft
   for rating–text consistency, voice match, and on-topic fit. If it objects,
   the agent rewrites with that feedback and re-checks, for up to two cycles.
   This act → critique → revise loop is what makes it an agent.

5. **Post-process.** The rating is clamped to the 1–5 range. An optional Nigerian
   Pidgin rendering layer can restyle the review while preserving meaning,
   sentiment, and rating.
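
Steps 3 to 5 can be sketched as a small control loop. This is a minimal sketch under stated assumptions, not the project's code: the LLM generator and critic are replaced by stub functions, `Draft`, `combine`, `run_agent`, and `clamp_rating` are hypothetical names, and modelling "prior adjusted by evidence" as a simple sum is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_REVISIONS = 2  # "up to two cycles"; this sketch returns the last draft after that


@dataclass
class Draft:
    rating: float
    review: str


def clamp_rating(value: float, low: int = 1, high: int = 5) -> int:
    """Round to a whole star and clamp into the valid range."""
    return max(low, min(high, round(value)))


def combine(prior: float, evidence_shift: float) -> float:
    """Persona prior adjusted by item evidence (additive, by assumption)."""
    return prior + evidence_shift


def run_agent(
    generate: Callable[[float, Optional[str]], Draft],
    critique: Callable[[Draft], Optional[str]],
    prior: float,
    evidence_shift: float,
) -> Draft:
    draft = generate(combine(prior, evidence_shift), None)
    for _ in range(MAX_REVISIONS):
        feedback = critique(draft)
        if feedback is None:  # the critic has no objection
            break
        draft = generate(draft.rating, feedback)  # rewrite using the feedback
    return Draft(clamp_rating(draft.rating), draft.review)


def stub_generate(rating: float, feedback: Optional[str]) -> Draft:
    # The first draft is deliberately off-tone so the critic has something to catch.
    text = "Loved every page." if feedback is None else "Disappointing and thin."
    return Draft(rating, text)


def stub_critique(draft: Draft) -> Optional[str]:
    # Toy rating-text consistency check standing in for the critic LLM.
    if draft.rating <= 2.5 and "loved" in draft.review.lower():
        return "Review text is positive but the rating is low."
    return None


# A generous reviewer (prior 4.6) meets a poor item (evidence shift -2.8).
result = run_agent(stub_generate, stub_critique, prior=4.6, evidence_shift=-2.8)
print(result)
```

Here the stub critic flags the mismatch between a low rating and a glowing review, the draft is rewritten once, and the final rating stays low despite the generous prior.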

## Reliability

- **Provider failover.** The agent runs a primary and a secondary LLM
  provider. If the primary fails (quota, rate limit, or a transient service
  error), the same call is retried automatically on the secondary, so a live
  demo does not break when one provider is briefly unavailable.
- **Graceful degradation.** If an LLM call fails, the agent falls back to a
  deterministic persona rather than crashing.
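
The failover behaviour could look roughly like the following. This is a sketch under stated assumptions rather than the project's implementation: providers are modelled as plain callables, and `ProviderError` and `call_with_failover` are hypothetical names standing in for whatever the real client code raises and exposes.

```python
from typing import Callable, Optional, Sequence


class ProviderError(RuntimeError):
    """Hypothetical error type for quota, rate-limit, or transient failures."""


def call_with_failover(prompt: str, providers: Sequence[Callable[[str], str]]) -> str:
    """Try each provider in order and return the first successful response."""
    last_error: Optional[ProviderError] = None
    for provider in providers:
        try:
            return provider(prompt)
        except ProviderError as exc:
            last_error = exc  # remember why this provider failed, then move on
    raise ProviderError("all providers failed") from last_error


def flaky_primary(prompt: str) -> str:
    # Stand-in for a primary provider that is currently rate limited.
    raise ProviderError("rate limit exceeded")


def healthy_secondary(prompt: str) -> str:
    # Stand-in for a secondary provider that answers normally.
    return f"secondary handled: {prompt}"


print(call_with_failover("rate this book", [flaky_primary, healthy_secondary]))
```

A caller that catches the final `ProviderError` is the natural place to apply the deterministic fallback described above.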

## How it maps to the Task A rubric

- **Review Text Quality**: reviews are grounded in the person's real past
  reviews and self-critiqued for voice match.
- **Rating Accuracy**: the two-step prior-plus-evidence rating logic
  corrects the common failure of predicting from the user average alone.
- **Behavioural Fidelity**: persona-conditioned generation; the persona
  portrait is visible in the app for inspection.
- **Nigerian contextualization (bonus)**: a toggleable Nigerian Pidgin
  rendering layer; off by default so scored output stays standard English.

## Running locally

```bash
pip install -r requirements.txt
# set your keys in a .env file:
#   LLM_PROVIDER=openai
#   OPENAI_API_KEY=...
#   GEMINI_API_KEY=...
streamlit run app.py
```

`LLM_PROVIDER` sets the primary provider; the other provider, if its key is
present, is used as the automatic failover. The processed data
(`data/processed/*.parquet`) must be present.

## Project layout

```
core/                 shared engine: config, llm, persona, reflection, nigerian
task_a_user_modeling/ the User Modeling agent
scripts/              test harness (test_task_a.py)
data/processed/       Amazon Reviews 2023: Books · Movies & TV · Kindle Store
app.py                Streamlit demo, three input modes
```

## Configuration

Set in a `.env` file (never commit it):

- `LLM_PROVIDER`: `openai` or `gemini` (the primary provider)
- `OPENAI_API_KEY` / `GEMINI_API_KEY`: both should be set so the non-primary
  provider can serve as the automatic failover

On a HuggingFace Space, set these as **Secrets** in Space settings.
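
To make the primary and failover rule concrete, here is one way the provider order might be resolved from these variables. It is a sketch only: `provider_order` and `KEY_VARS` are hypothetical names, and rejecting a primary provider whose key is missing is an assumption about how the app might behave.

```python
import os
from typing import Mapping, Optional

# Environment variable holding the API key for each supported provider.
KEY_VARS = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def provider_order(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the primary provider first, then any other provider that has a key."""
    env = os.environ if env is None else env
    primary = env.get("LLM_PROVIDER", "").strip().lower()
    if primary not in KEY_VARS:
        raise ValueError("LLM_PROVIDER must be 'openai' or 'gemini'")
    if not env.get(KEY_VARS[primary]):
        raise ValueError(f"{KEY_VARS[primary]} is not set")
    order = [primary]
    for name, key_var in KEY_VARS.items():
        if name != primary and env.get(key_var):
            order.append(name)  # this provider becomes the failover
    return order


# Placeholder values, not real keys.
example_env = {
    "LLM_PROVIDER": "gemini",
    "GEMINI_API_KEY": "placeholder",
    "OPENAI_API_KEY": "placeholder",
}
print(provider_order(example_env))
```

With only the primary key present, the returned list has a single entry and no failover is available.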

## Notes and honest limitations

- The self-reflection critic checks internal consistency; it cannot catch a
  rating that is wrong but self-consistent.
- Rating prediction on hard cases (a critical user who loved something) is
  improved by the two-step logic but can still be ~0.5–1.0★ off.
- LLM output is non-deterministic; single-run results vary, so evaluation
  averages across many users.
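
Averaging across users can be sketched as below. The README does not name the metric, so mean absolute error is an assumption here, `mean_absolute_error` is a hypothetical helper, and the numbers are made-up toy values, not results.

```python
from statistics import mean
from typing import Iterable, Tuple


def mean_absolute_error(pairs: Iterable[Tuple[float, float]]) -> float:
    """pairs: (predicted_rating, held_out_rating), one pair per user."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no users to evaluate")
    return mean(abs(predicted - actual) for predicted, actual in pairs)


# Toy values for illustration only.
toy_pairs = [(4, 5), (2, 2), (5, 4), (3, 1)]
print(mean_absolute_error(toy_pairs))
```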

## Credits

Built for the DSN × BCT LLM Agent Challenge 2026.
Author: Israel Akomodesegbe. Team: Winning Team. Dataset: Amazon Reviews 2023.