Israelbliz committed on
Commit
07c68ca
·
verified ·
1 Parent(s): 1539e17

Upload app and README

Files changed (2)
  1. README.md +127 -0
  2. app.py +538 -0
README.md ADDED
@@ -0,0 +1,127 @@
+ ---
+ title: User Modeling Agent
+ emoji: πŸ“
+ colorFrom: green
+ colorTo: red
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # User Modeling Agent
+
+ **DSN × BCT LLM Agent Challenge 2026, Task A.**
+
+ An agent that reads a person into a behavioural *persona*, then writes the
+ star rating and the review that person would leave for an unseen product,
+ and critiques and revises its own draft before returning it.
+
+ > Live demo: *(your HuggingFace Space URL)*
+ > Code: *(your GitHub repo URL)*
+
+ ---
+
+ ## What it does
+
+ Given a **user persona** and **product details**, the agent produces:
+
+ - a **star rating** (1–5) the user would likely give, and
+ - a **written review** in that user's voice, with tone, length, and quirks matched.
+
+ It is not a generic review generator. Every output is conditioned on a
+ specific reader, and the rating is reasoned, not guessed.
+
+ ## The agentic workflow
+
+ The system is an agent, not a single prompt. It runs a five-step loop:
+
+ 1. **Build the persona.** A `PersonaEngine` extracts a structured persona:
+    quantitative signals (average rating, rating spread, review length,
+    domains, rating distribution) and qualitative voice (tone, preferred
+    themes, common complaints, a one-line voice descriptor) distilled by an
+    LLM from sample reviews. In the deployed app the persona can also be
+    *composed directly* from typed input, which is the brief's
+    persona-as-input contract.
+
+ 2. **Select grounding history.** For a real user, the agent picks the few
+    past reviews most similar to the target item, so it writes from concrete
+    evidence of how this person actually phrases things.
+
+ 3. **Generate the rating and review.** A single LLM call, with the rating
+    reasoned in two explicit steps: first the persona *prior* (what this
+    user usually gives), then the *item evidence* (what the title and
+    description signal). The final rating is the prior adjusted by the
+    evidence, so a generous reviewer still rates a poor item low and a
+    critical reviewer still rates a strong item high.
+
+ 4. **Self-reflection: critique and revise.** A critic LLM audits the draft
+    for rating–text consistency, voice match, and on-topic fit. If it
+    objects, the agent rewrites with that feedback and re-checks, for up to
+    two cycles. This act → critique → revise loop is what makes it an agent.
+
+ 5. **Post-process.** The rating is clamped to range. An optional Nigerian
+    Pidgin rendering layer can restyle the review while preserving meaning,
+    sentiment, and rating.
+
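The critique-and-revise step (step 4) can be sketched as a small bounded loop. This is a minimal illustration and not code from this commit: `Draft`, `reflect`, `toy_critic`, and `toy_reviser` are hypothetical names, and the two toy functions stand in for the LLM calls. The real agent may order its re-check differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    rating: float
    review: str


def reflect(
    draft: Draft,
    critique: Callable[[Draft], Optional[str]],  # returns None when the draft passes
    revise: Callable[[Draft, str], Draft],
    max_cycles: int = 2,
) -> tuple[Draft, int, list[str]]:
    """Critique the draft and revise it until the critic passes or the budget runs out."""
    notes: list[str] = []
    cycles = 0
    while cycles < max_cycles:
        objection = critique(draft)
        cycles += 1
        if objection is None:
            break  # critic is satisfied, keep the current draft
        notes.append(objection)
        draft = revise(draft, objection)
    return draft, cycles, notes


# Toy stand-ins for the critic and reviser LLM calls (assumptions for illustration).
def toy_critic(d: Draft) -> Optional[str]:
    if d.rating <= 2 and "loved" in d.review:
        return "rating and text disagree"
    return None


def toy_reviser(d: Draft, feedback: str) -> Draft:
    return Draft(d.rating, d.review.replace("loved", "was let down by"))


final, cycles, notes = reflect(Draft(2.0, "I loved this book."), toy_critic, toy_reviser)
```

Here the first critique objects, the draft is rewritten once, and the second critique passes, so the loop stops inside its two-cycle budget.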
+ The agent degrades gracefully: if an LLM call fails, it falls back to a
+ deterministic persona rather than crashing.
+
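One way to picture the fallback is a wrapper that keeps the quantitative stats and fills the voice fields by rule when the LLM call raises. The README does not say what the deterministic persona contains, so the field defaults and the tone thresholds below are assumptions, and `enrich_or_fallback` and `failing_llm` are hypothetical names.

```python
from typing import Callable


def enrich_or_fallback(stats: dict, llm_enrich: Callable[[dict], dict]) -> dict:
    """Return stats plus LLM-derived voice fields, or rule-based defaults if the call fails."""
    try:
        voice = llm_enrich(stats)
    except Exception:
        # Hypothetical deterministic fallback: derive the tone from the average rating only.
        avg = stats["avg_rating"]
        tone = "enthusiastic" if avg >= 4.0 else "critical" if avg <= 2.5 else "neutral"
        voice = {"tone": tone, "preferred_themes": [], "common_complaints": [],
                 "voice_one_liner": ""}
    return {**stats, **voice}


def failing_llm(stats: dict) -> dict:
    # Simulates a provider outage or rate limit.
    raise TimeoutError("simulated provider outage")


persona = enrich_or_fallback({"avg_rating": 4.4, "n_reviews": 12}, failing_llm)
```

The caller receives a usable persona either way, which is what lets the rest of the pipeline continue.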
+ ## How it maps to the Task A rubric
+
+ - **Review Text Quality**: reviews are grounded in the user's real past
+   reviews and self-critiqued for voice match.
+ - **Rating Accuracy**: the two-step prior-plus-evidence rating logic
+   corrects the common failure of predicting from the user average alone.
+ - **Behavioural Fidelity**: persona-conditioned generation; the persona
+   portrait is visible in the app for inspection.
+ - **Nigerian contextualization (bonus)**: a toggleable Nigerian Pidgin
+   rendering layer; off by default so scored output stays standard English.
+
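The prior-plus-evidence idea behind the Rating Accuracy point can be shown with plain arithmetic. In the agent the two steps are reasoned inside an LLM call, so the additive `evidence_shift` below is an assumption chosen for illustration, and `predict_rating` is a hypothetical helper that also applies the clamp from the post-process step.

```python
def predict_rating(prior: float, evidence_shift: float,
                   lo: float = 1.0, hi: float = 5.0) -> float:
    """Adjust the persona prior by the item evidence, then clamp to the star range."""
    return max(lo, min(hi, prior + evidence_shift))


# A generous reviewer (prior 4.5) facing a poor item still lands low.
generous_on_poor_item = predict_rating(prior=4.5, evidence_shift=-2.5)

# A critical reviewer (prior 2.5) facing a strong item still lands high.
critical_on_strong_item = predict_rating(prior=2.5, evidence_shift=2.0)

# The clamp keeps the result inside 1 to 5.
clamped = predict_rating(prior=4.5, evidence_shift=1.0)
```

Predicting from the user average alone corresponds to `evidence_shift=0`, which is the failure mode the rubric note describes.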
+ ## Running locally
+
+ ```bash
+ pip install -r requirements.txt
+ # set your key in a .env file: LLM_PROVIDER=gemini and GEMINI_API_KEY=...
+ streamlit run app.py
+ ```
+
+ The processed data (`data/processed/*.parquet`) must be present.
+
+ A FastAPI service is also available:
+
+ ```bash
+ uvicorn task_a_user_modeling.main:app --reload
+ ```
+
+ ## Project layout
+
+ ```
+ core/                   shared engine: config, llm, persona, reflection, nigerian
+ task_a_user_modeling/   the Impersonation agent + FastAPI service
+ scripts/                test harness (test_task_a.py)
+ data/processed/         Amazon Reviews 2023: Books · Movies & TV · Kindle Store
+ app.py                  Streamlit demo
+ ```
+
+ ## Configuration
+
+ Set in a `.env` file (never commit it):
+
+ - `LLM_PROVIDER`: `gemini` or `openai`
+ - `GEMINI_API_KEY` / `OPENAI_API_KEY`
+
+ On a HuggingFace Space, set these as **Secrets** in Space settings.
+
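A startup check for these variables might look like the sketch below. The variable names come from the list above, but how `core.config` actually reads them is not shown in this commit, so `resolve_llm_config` and `KEY_VARS` are hypothetical. The example passes a plain dict so it runs without a real key; in the app the mapping would typically be `os.environ`.

```python
from typing import Mapping

# Which API-key variable each provider needs (names taken from the README).
KEY_VARS = {"gemini": "GEMINI_API_KEY", "openai": "OPENAI_API_KEY"}


def resolve_llm_config(env: Mapping[str, str]) -> tuple[str, str]:
    """Return (provider, api_key) or raise a clear error for a bad configuration."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in KEY_VARS:
        raise ValueError(f"LLM_PROVIDER must be one of {sorted(KEY_VARS)}, got {provider!r}")
    key = env.get(KEY_VARS[provider], "").strip()
    if not key:
        raise ValueError(f"{KEY_VARS[provider]} is not set")
    return provider, key


provider, api_key = resolve_llm_config(
    {"LLM_PROVIDER": "gemini", "GEMINI_API_KEY": "example-key"}
)
```

Failing early with a named variable is easier to debug on a Space than a provider error raised mid-request.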
+ ## Notes and honest limitations
+
+ - The self-reflection critic checks internal consistency; it cannot catch a
+   rating that is wrong but self-consistent.
+ - Rating prediction on hard cases (a critical user who loved something) is
+   improved by the two-step logic but can still be ~0.5–1.0★ off.
+ - LLM output is non-deterministic; single-run results vary, so evaluation
+   averages across many users.
+
+ ## Credits
+
+ Built for the DSN × BCT LLM Agent Challenge 2026.
+ Author: *(your name)*. Dataset: Amazon Reviews 2023.
app.py ADDED
@@ -0,0 +1,538 @@
+ """User Modeling Agent: the demo.
+
+ DSN × BCT LLM Agent Challenge · Task A.
+
+ Takes a user persona and product details as input, and generates a star
+ rating and a written review as that user would write it, then critiques
+ and revises its own draft (self-reflection). Optionally renders the review
+ in Nigerian English.
+
+ Two ways to use it:
+   1. Compose a persona: type a persona + product (the brief's input contract)
+   2. Dataset reader: pick a real user, compare against ground truth
+
+ Run:
+     streamlit run app.py
+ """
+ from __future__ import annotations
+
+ import html
+ import sys
+ from pathlib import Path
+
+ ROOT = Path(__file__).resolve().parent
+ if str(ROOT) not in sys.path:
+     sys.path.insert(0, str(ROOT))
+
+ import pandas as pd
+ import streamlit as st
+
+ from core.config import settings
+ from core.persona import PersonaEngine, UserPersona
+ from task_a_user_modeling.agent import ImpersonationAgent, ItemInput
+
+ st.set_page_config(page_title="User Modeling Agent", page_icon="✶",
+                    layout="wide", initial_sidebar_state="expanded")
+
+ esc = html.escape
+
+
+ # ══════════════════════════════════════════════════════════════════════════════
+ # Design system
+ # ══════════════════════════════════════════════════════════════════════════════
+
+ CSS = """
+ <style>
+ @import url('https://fonts.googleapis.com/css2?family=Fraunces:opsz,wght@9..144,400;9..144,500;9..144,600;9..144,900&family=Newsreader:ital,opsz,wght@0,6..72,400;0,6..72,500;0,6..72,600;1,6..72,400&family=Spline+Sans+Mono:wght@400;500;600&display=swap');
+
+ :root {
+   --paper:#f3ecdb; --paper-2:#fffdf6; --paper-3:#ece2cb;
+   --pine:#1d3a2b; --pine-2:#2c5440; --pine-ink:#14241b;
+   --clay:#b0472b; --ochre:#c98a3c; --gold:#d8a64a;
+   --ink:#221e16; --muted:#6f6651; --hair:#d4c8aa;
+ }
+ .stApp { background:var(--paper); color:var(--ink); }
+ .stApp::before {
+   content:""; position:fixed; inset:0; pointer-events:none; z-index:0;
+   background:
+     radial-gradient(900px 600px at 12% -5%, rgba(45,84,64,.10), transparent 60%),
+     radial-gradient(800px 600px at 95% 8%, rgba(176,71,43,.08), transparent 55%);
+ }
+ [data-testid="stMainBlockContainer"] { max-width:1140px; padding-top:2rem; padding-bottom:4rem; }
+ h1,h2,h3,h4 { font-family:'Fraunces',Georgia,serif !important; color:var(--pine) !important;
+   letter-spacing:-0.015em; font-weight:600 !important; }
+ html,body,p,div,span,label,li,.stMarkdown { font-family:'Newsreader',Georgia,serif; }
+ .stCaption,[data-testid="stCaptionContainer"] { font-family:'Spline Sans Mono',monospace !important; }
+
+ .masthead { position:relative; z-index:1; margin-bottom:0.3rem; }
+ .mast-rule { height:2px; background:var(--pine); margin-bottom:0.5rem; }
+ .mast-kicker { font-family:'Spline Sans Mono',monospace; font-size:0.70rem;
+   letter-spacing:0.30em; text-transform:uppercase; color:var(--clay); font-weight:600; }
+ .mast-title { font-family:'Fraunces',serif; font-weight:900;
+   font-size:clamp(2.3rem,5.5vw,3.7rem); line-height:1.0; color:var(--pine);
+   margin:0.16rem 0 0.1rem; letter-spacing:-0.03em; }
+ .mast-title .em { color:var(--clay); font-style:italic; font-weight:500; }
+ .mast-stand { font-family:'Newsreader',serif; font-size:1.08rem; color:#45402f;
+   max-width:66ch; line-height:1.45; }
+ .mast-stand em { color:var(--clay); font-style:italic; }
+ .mast-rule-bot { height:1px; background:var(--hair); margin:0.85rem 0 0.2rem; }
+
+ .sec-label { font-family:'Spline Sans Mono',monospace; font-size:0.70rem;
+   letter-spacing:0.2em; text-transform:uppercase; color:var(--clay);
+   font-weight:600; margin:0.3rem 0 0.15rem; }
+
+ .card { background:var(--paper-2); border:1px solid var(--hair); border-radius:3px;
+   padding:1.1rem 1.3rem; margin:0.5rem 0 0.85rem; position:relative; z-index:1; }
+ .card-kicker { font-family:'Spline Sans Mono',monospace; font-size:0.64rem;
+   letter-spacing:0.2em; text-transform:uppercase; color:var(--clay);
+   font-weight:600; margin-bottom:0.5rem; }
+
+ .persona-quote { font-family:'Fraunces',serif; font-weight:500; font-style:italic;
+   font-size:1.26rem; line-height:1.34; color:var(--pine); margin:0.1rem 0 0.8rem;
+   padding-left:0.8rem; border-left:3px solid var(--ochre); }
+ .pstats { display:flex; gap:1.7rem; flex-wrap:wrap; align-items:flex-end; }
+ .pstat .num { font-family:'Fraunces',serif; font-weight:900; font-size:1.5rem;
+   color:var(--pine); line-height:1; }
+ .pstat .lab { font-family:'Spline Sans Mono',monospace; font-size:0.60rem;
+   letter-spacing:0.13em; text-transform:uppercase; color:var(--muted); margin-top:0.2rem; }
+ .chips { margin-top:0.6rem; }
+ .chip-lab { font-family:'Spline Sans Mono',monospace; font-size:0.60rem;
+   letter-spacing:0.12em; text-transform:uppercase; color:var(--muted); margin-right:0.4rem; }
+ .chip { display:inline-block; margin:0.15rem 0.25rem 0.15rem 0; padding:0.15rem 0.6rem;
+   border-radius:999px; font-family:'Spline Sans Mono',monospace; font-size:0.72rem;
+   background:var(--paper-3); color:var(--pine-2); border:1px solid var(--hair); }
+ .chip.warn { background:#f0ddd2; color:var(--clay); border-color:#e3c4b4; }
+
+ .panel { background:var(--pine-ink); border-radius:3px; padding:1.35rem 1.55rem;
+   margin:0.5rem 0 0.85rem; position:relative; z-index:1;
+   box-shadow:0 14px 34px -22px rgba(20,36,27,.7); }
+ .panel .card-kicker { color:var(--gold); }
+ .rating-row { display:flex; align-items:center; gap:0.8rem; margin:0.25rem 0 0.65rem; }
+ .rating-chip { font-family:'Fraunces',serif; font-weight:900; font-size:1.6rem;
+   background:var(--clay); color:#fff7ec; padding:0.05rem 0.65rem; border-radius:3px; }
+ .stars { font-size:1.15rem; letter-spacing:0.1em; color:var(--gold); }
+ .review-body { font-family:'Newsreader',serif; font-size:1.1rem; line-height:1.7;
+   color:#f0e9d6; white-space:pre-wrap; }
+ .naija-badge { display:inline-block; margin-left:0.45rem; font-family:'Spline Sans Mono',monospace;
+   font-size:0.60rem; letter-spacing:0.12em; font-weight:600; background:#e9f0e2;
+   color:var(--pine); padding:0.12rem 0.5rem; border-radius:999px; border:1px solid #cdd9bf; }
+
+ .stepper { display:flex; gap:0; margin:0.3rem 0 0.2rem; flex-wrap:wrap; }
+ .step { flex:1; min-width:125px; padding:0.5rem 0.65rem; position:relative; }
+ .step .dot { width:11px; height:11px; border-radius:50%; background:var(--pine); margin-bottom:0.35rem; }
+ .step.flag .dot { background:var(--clay); }
+ .step.pass .dot { background:var(--pine-2); }
+ .step .st-name { font-family:'Fraunces',serif; font-weight:600; font-size:0.93rem;
+   color:var(--pine); line-height:1.1; }
+ .step .st-sub { font-family:'Spline Sans Mono',monospace; font-size:0.63rem;
+   color:var(--muted); margin-top:0.18rem; }
+ .step:not(:last-child)::after { content:""; position:absolute; top:0.87rem; right:-2px;
+   width:100%; height:1px;
+   background:repeating-linear-gradient(90deg,var(--hair) 0 6px,transparent 6px 12px); }
+ .critique-note { font-family:'Newsreader',serif; font-style:italic; font-size:0.93rem;
+   color:#5a4030; line-height:1.45; background:#f0ddd2; border-left:3px solid var(--clay);
+   padding:0.5rem 0.75rem; border-radius:2px; margin-top:0.45rem; }
+
+ .cmp { background:var(--paper-2); border:1px solid var(--hair); border-radius:3px;
+   padding:0.9rem 1.05rem; height:100%; }
+ .cmp.truth { border-top:3px solid var(--pine-2); }
+ .cmp.agent { border-top:3px solid var(--clay); }
+ .cmp-head { font-family:'Spline Sans Mono',monospace; font-size:0.62rem;
+   letter-spacing:0.15em; text-transform:uppercase; color:var(--muted); margin-bottom:0.35rem; }
+ .cmp-body { font-family:'Newsreader',serif; font-size:0.97rem; line-height:1.5;
+   color:#4a4434; white-space:pre-wrap; }
+ .delta { font-family:'Spline Sans Mono',monospace; font-size:0.70rem; font-weight:600;
+   padding:0.16rem 0.55rem; border-radius:999px; }
+ .delta.good { background:#e3ecd9; color:var(--pine); }
+ .delta.mid { background:#f3e6c8; color:#8a6420; }
+ .delta.far { background:#f0d8cc; color:var(--clay); }
+
+ .empty { border:1px dashed var(--hair); border-radius:3px; padding:1.5rem; text-align:center;
+   font-family:'Newsreader',serif; font-style:italic; color:var(--muted); font-size:1rem;
+   background:rgba(255,253,246,.5); }
+
+ @keyframes rise { from{opacity:0;transform:translateY(13px);} to{opacity:1;transform:translateY(0);} }
+ .reveal { animation:rise 0.55s cubic-bezier(.2,.7,.2,1) both; }
+ .d1{animation-delay:.04s;} .d2{animation-delay:.13s;} .d3{animation-delay:.22s;}
+
+ .stButton > button { background:var(--pine); color:var(--paper); border:none; border-radius:3px;
+   font-family:'Spline Sans Mono',monospace; font-weight:600; font-size:0.82rem;
+   letter-spacing:0.05em; padding:0.55rem 1rem; }
+ .stButton > button:hover { background:var(--clay); color:#fff7ec; }
+ [data-testid="stSidebar"] { background:var(--pine-ink); border-right:1px solid #2c4133; }
+ [data-testid="stSidebar"] * { color:#e7e0cd; }
+ [data-testid="stSidebar"] h1,[data-testid="stSidebar"] h2,[data-testid="stSidebar"] h3 { color:#f3ecdb !important; }
+ [data-baseweb="tab-list"] { gap:0.3rem; border-bottom:2px solid var(--pine); }
+ [data-baseweb="tab"] { font-family:'Fraunces',serif !important; font-weight:600;
+   font-size:1rem; color:var(--muted); }
+ [data-baseweb="tab"][aria-selected="true"] { color:var(--pine) !important; }
+ [data-baseweb="tab-highlight"] { background:var(--clay) !important; height:3px; }
+ .foot { margin-top:2.2rem; padding-top:0.85rem; border-top:1px solid var(--hair);
+   font-family:'Spline Sans Mono',monospace; font-size:0.68rem; color:var(--muted); line-height:1.6; }
+ </style>
+ """
+ st.markdown(CSS, unsafe_allow_html=True)
+
+
+ # ══════════════════════════════════════════════════════════════════════════════
+ # HTML builders
+ # ══════════════════════════════════════════════════════════════════════════════
+
+ def stars(r: float) -> str:
+     f = int(round(r))
+     return "★" * f + "☆" * (5 - f)
+
+
+ def persona_card(p: UserPersona) -> str:
+     themes = "".join(f'<span class="chip">{esc(t)}</span>'
+                      for t in p.preferred_themes) or '<span class="chip">–</span>'
+     comps = "".join(f'<span class="chip warn">{esc(t)}</span>'
+                     for t in p.common_complaints) or '<span class="chip warn">–</span>'
+     nrev = (f'{p.n_reviews}' if p.n_reviews else 'composed')
+     return f"""
+ <div class="card reveal d1">
+   <div class="card-kicker">The Reader · persona</div>
+   <div class="persona-quote">“{esc(p.voice_one_liner or 'No voice captured.')}”</div>
+   <div class="pstats">
+     <div class="pstat"><div class="num">{nrev}</div><div class="lab">history</div></div>
+     <div class="pstat"><div class="num">{p.avg_rating:.1f}★</div><div class="lab">avg rating</div></div>
+     <div class="pstat"><div class="num">{esc(p.tone or '–')}</div><div class="lab">tone</div></div>
+   </div>
+   <div class="chips"><span class="chip-lab">drawn to</span>{themes}</div>
+   <div class="chips"><span class="chip-lab">put off by</span>{comps}</div>
+ </div>"""
+
+
+ def reflection_stepper(iters: int, refined: bool, notes: list[str] | None) -> str:
+     steps = ['<div class="step pass"><div class="dot"></div>'
+              '<div class="st-name">First draft</div>'
+              '<div class="st-sub">generated in-voice</div></div>']
+     if refined:
+         steps += ['<div class="step flag"><div class="dot"></div>'
+                   '<div class="st-name">Self-critique</div>'
+                   '<div class="st-sub">found issues</div></div>',
+                   '<div class="step pass"><div class="dot"></div>'
+                   '<div class="st-name">Revised draft</div>'
+                   '<div class="st-sub">rewritten with feedback</div></div>',
+                   '<div class="step pass"><div class="dot"></div>'
+                   '<div class="st-name">Re-checked</div>'
+                   '<div class="st-sub">critique cleared</div></div>']
+     else:
+         steps += ['<div class="step pass"><div class="dot"></div>'
+                   '<div class="st-name">Self-critique</div>'
+                   '<div class="st-sub">passed first pass</div></div>',
+                   '<div class="step pass"><div class="dot"></div>'
+                   '<div class="st-name">Accepted</div>'
+                   '<div class="st-sub">no revision needed</div></div>']
+     note = ""
+     if notes:
+         real = [n for n in notes if n and n.strip().lower() != "passed"]
+         if real:
+             note = f'<div class="critique-note">The critic flagged: {esc(real[0])}</div>'
+     return f"""
+ <div class="card reveal d3">
+   <div class="card-kicker">Self-reflection · {iters} critique cycle(s)</div>
+   <div class="stepper">{''.join(steps)}</div>
+   {note}
+ </div>"""
+
+
+ # ══════════════════════════════════════════════════════════════════════════════
+ # Cached resources
+ # ══════════════════════════════════════════════════════════════════════════════
+
+ @st.cache_data(show_spinner=False)
+ def load_data():
+     rev = pd.read_parquet(settings.processed_dir / "reviews.parquet")
+     items = pd.read_parquet(settings.processed_dir / "items.parquet")
+     return rev, items
+
+
+ @st.cache_resource(show_spinner=False)
+ def get_engines():
+     return PersonaEngine(), ImpersonationAgent()
+
+
+ def composed_persona(desc: str, themes: list[str], dislikes: list[str],
+                      tone: str, avg_rating: float) -> UserPersona:
+     """Build a UserPersona from typed input (the brief's persona-as-input contract)."""
+     # rating distribution skewed around the stated average
+     lo, hi = int(avg_rating), min(5, int(avg_rating) + 1)
+     dist = {lo: 0.55, hi: 0.35} if lo != hi else {lo: 0.9}
+     dist.setdefault(3, 0.1)
+     return UserPersona(
+         user_id="composed", n_reviews=0, avg_rating=avg_rating,
+         std_rating=0.6, avg_review_length=90.0, std_review_length=30.0,
+         verified_rate=1.0, domains=[], n_domains=0,
+         rating_distribution=dist, top_terms=[],
+         tone=tone, preferred_themes=themes, common_complaints=dislikes,
+         voice_one_liner=desc, history_samples=[],
+     )
+
+
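A note on the synthetic rating distribution in `composed_persona`: because `setdefault(3, 0.1)` is a no-op whenever 3 is already a key, the weights total 0.9 rather than 1.0 for any typical rating from 2.0 up to (but not including) 4.0. Whether `UserPersona` or the prompt needs a normalised distribution is not visible in this diff, so the sketch below only demonstrates the arithmetic and one possible fix. `composed_distribution` copies the three lines from the function; `normalised` is a hypothetical helper.

```python
def composed_distribution(avg_rating: float) -> dict[int, float]:
    """Same arithmetic as the rating-distribution lines in composed_persona."""
    lo, hi = int(avg_rating), min(5, int(avg_rating) + 1)
    dist = {lo: 0.55, hi: 0.35} if lo != hi else {lo: 0.9}
    dist.setdefault(3, 0.1)
    return dist


def normalised(dist: dict[int, float]) -> dict[int, float]:
    """Hypothetical fix: rescale the weights so they sum to 1."""
    total = sum(dist.values())
    return {stars: weight / total for stars, weight in dist.items()}


# Total weight for a few slider values, rounded to hide float noise.
totals = {avg: round(sum(composed_distribution(avg).values()), 2)
          for avg in (1.0, 3.0, 4.0, 5.0)}

fixed = normalised(composed_distribution(3.0))
```

For a typical rating of 3.0 the keys are 3 and 4 only, so the total is 0.9; after rescaling, the same two keys carry weights that sum to 1.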
+ # ══════════════════════════════════════════════════════════════════════════════
+ # Masthead
+ # ══════════════════════════════════════════════════════════════════════════════
+
+ st.markdown("""
+ <div class="masthead">
+   <div class="mast-rule"></div>
+   <div class="mast-kicker">DSN × BCT LLM Agent Challenge · Task A</div>
+   <div class="mast-title">User Modeling <span class="em">Agent</span></div>
+   <div class="mast-stand">
+     Give it a <em>user persona</em> and a <em>product</em>. It writes the star
+     rating and the review that user would write, weighing what they usually do
+     against what this specific item signals, then <em>critiques and revises</em>
+     its own draft before showing it.
+   </div>
+   <div class="mast-rule-bot"></div>
+ </div>
+ """, unsafe_allow_html=True)
+
+ try:
+     reviews, items = load_data()
+ except Exception as e:
+     st.error(f"Could not load data. Ensure data/processed/*.parquet exist.\n\n{e}")
+     st.stop()
+
+ train = reviews[reviews["split"] == "train"]
+ test = reviews[reviews["split"] == "test"]
+ persona_engine, agent = get_engines()
+
+ with st.sidebar:
+     st.markdown("## ✶ Controls")
+     naija = st.toggle("🇳🇬 Naija mode", value=False,
+                       help="Render the review in Nigerian English. Meaning, "
+                            "sentiment and rating are preserved; only voice shifts.")
+     st.caption("Naija mode ON: review in Nigerian English."
+                if naija else "Standard English output.")
+     st.divider()
+     st.markdown("### How it works")
+     st.caption("The agent builds a persona, drafts a review in that voice, then "
+                "runs a self-reflection loop: a critic LLM checks rating-text "
+                "consistency, voice match and on-topic fit, and the agent revises "
+                "if the critic objects.")
+     st.divider()
+     st.caption("Built by Israel")
+
+ st.session_state.setdefault("result", None)
+ st.session_state.setdefault("ctx", None)
+
+
+ # ══════════════════════════════════════════════════════════════════════════════
+ # Tabs: Compose (primary) · Dataset reader (secondary)
+ # ══════════════════════════════════════════════════════════════════════════════
+
+ tab_compose, tab_dataset = st.tabs(["✎ Compose a persona",
+                                     "⊞ Dataset reader"])
+
+ # ── COMPOSE ───────────────────────────────────────────────────────────────────
+ with tab_compose:
+     st.markdown('<div class="sec-label">Input · persona and product</div>',
+                 unsafe_allow_html=True)
+     st.markdown("Describe a reader and a product. The agent will write the "
+                 "review that reader would leave.")
+
+     cL, cR = st.columns(2)
+     with cL:
+         st.markdown("**The reader**")
+         p_desc = st.text_area(
+             "Describe the reader's reviewing voice",
+             value="A thoughtful reader who loves character-driven stories and "
+                   "rich world-building, but is impatient with slow pacing.",
+             height=90, key="p_desc")
+         p_themes = st.text_input("Drawn to (comma-separated)",
+                                  value="character development, immersive worlds, "
+                                        "original plots", key="p_themes")
+         p_dislikes = st.text_input("Put off by (comma-separated)",
+                                    value="slow pacing, thin characters", key="p_dis")
+         c1, c2 = st.columns(2)
+         with c1:
+             p_tone = st.selectbox("Tone", ["enthusiastic", "analytical", "casual",
+                                            "critical", "earnest", "terse"], key="p_tone")
+         with c2:
+             p_rating = st.slider("Typical rating", 1.0, 5.0, 4.0, 0.5, key="p_rate")
+     with cR:
+         st.markdown("**The product**")
+         i_title = st.text_input("Title",
+                                 value="The Midnight Library", key="i_title")
+         i_domain = st.selectbox("Domain", ["Books", "Movies_and_TV", "Kindle_Store"],
+                                 key="i_domain")
+         i_desc = st.text_area(
+             "Description / synopsis",
+             value="A novel about a library between life and death, where each "
+                   "book lets a woman try a different version of her life.",
+             height=110, key="i_desc")
+
+     go = st.button("Generate review ✶", key="go_compose", use_container_width=True)
+
+     if go:
+         try:
+             with st.status("The agent is working…", expanded=True) as status:
+                 themes = [t.strip() for t in p_themes.split(",") if t.strip()]
+                 dislikes = [t.strip() for t in p_dislikes.split(",") if t.strip()]
+                 st.write("Assembling the persona…")
+                 persona = composed_persona(p_desc, themes, dislikes, p_tone, p_rating)
+                 item = ItemInput(parent_asin="composed", title=i_title,
+                                  description=i_desc, categories="",
+                                  domain=i_domain)
+                 st.write("Drafting in the reader's voice, then self-critiquing…")
+                 result = agent.run(persona, item, naija_mode=naija)
+                 st.write("Self-reflection complete")
+                 status.update(label="Review generated", state="complete")
+             st.session_state.result = result
+             st.session_state.ctx = {"persona": persona, "item": item, "truth": None}
+         except Exception as e:
+             st.session_state.result = None
+             st.markdown(f'<div class="card" style="border-left:3px solid var(--clay)">'
+                         f'<div class="card-kicker">Generation interrupted</div>'
+                         f'The model call did not complete; it may be rate-limited. '
+                         f'Try again shortly.<br><span style="font-family:Spline Sans Mono,'
+                         f'monospace;font-size:0.72rem;color:#6f6651">'
+                         f'{esc(type(e).__name__)}</span></div>', unsafe_allow_html=True)
+
+ # ── DATASET READER ────────────────────────────────────────────────────────────
+ with tab_dataset:
+     st.markdown('<div class="sec-label">Input · a real reader from the data</div>',
+                 unsafe_allow_html=True)
+     st.markdown("Pick a reader. The agent builds their persona from real history "
+                 "and writes a review of a held-out item, compared to what they "
+                 "actually wrote.")
+
+     elig = train.groupby("user_id").size().reset_index(name="n")
+     elig = elig[(elig["n"] >= 5) & (elig["user_id"].isin(set(test["user_id"])))]
+     users = elig.sample(min(40, len(elig)), random_state=7)["user_id"].tolist()
+
+     cc1, cc2 = st.columns([3, 1])
+     with cc1:
+         user = st.selectbox("Reader", users, key="sel_user",
+                             label_visibility="collapsed")
+     with cc2:
+         go_ds = st.button("Generate ✶", key="go_ds", use_container_width=True)
+
+     if go_ds and user:
+         try:
+             with st.status("The agent is working…", expanded=True) as status:
+                 ut = test[test["user_id"] == user]
+                 if ut.empty:
+                     status.update(label="No held-out item for this reader",
+                                   state="error")
+                     st.stop()
+                 tr = ut.iloc[0]
+                 tid = tr["parent_asin"]
+                 meta = items[items["parent_asin"] == tid]
+                 if meta.empty:
+                     item = ItemInput(parent_asin=tid, title=str(tr.get("title", "")),
+                                      description="", categories="", domain=tr["domain"])
+                 else:
+                     m = meta.iloc[0]
+                     item = ItemInput(parent_asin=tid, title=str(m.get("title", "")),
+                                      description=str(m.get("description", ""))[:1500],
+                                      categories=str(m.get("categories", "")),
+                                      domain=tr["domain"],
+                                      average_rating=(float(m["average_rating"])
+                                                      if pd.notna(m.get("average_rating"))
+                                                      else None))
+                 st.write("Reading the reader's history…")
+                 persona = persona_engine.from_dataframe(user, train)
+                 persona = persona_engine.enrich(persona)
+                 st.write(f"Persona built from {persona.n_reviews} reviews")
+                 st.write("Drafting in their voice, then self-critiquing…")
+                 result = agent.run(persona, item, naija_mode=naija)
+                 st.write("Self-reflection complete")
+                 status.update(label="Review generated", state="complete")
+             st.session_state.result = result
+             st.session_state.ctx = {"persona": persona, "item": item,
+                                     "truth": {"rating": float(tr["rating"]),
+                                               "text": str(tr["text"])}}
+         except Exception as e:
+             st.session_state.result = None
+             st.markdown(f'<div class="card" style="border-left:3px solid var(--clay)">'
+                         f'<div class="card-kicker">Generation interrupted</div>'
+                         f'The model call did not complete; it may be rate-limited. '
+                         f'Try again shortly.<br><span style="font-family:Spline Sans Mono,'
+                         f'monospace;font-size:0.72rem;color:#6f6651">'
+                         f'{esc(type(e).__name__)}</span></div>', unsafe_allow_html=True)
+
+
+ # ══════════════════════════════════════════════════════════════════════════════
+ # Result: shown below both tabs
+ # ══════════════════════════════════════════════════════════════════════════════
+
+ res = st.session_state.result
+ ctx = st.session_state.ctx
+ st.markdown("---")
+ if res and ctx:
+     st.markdown(persona_card(ctx["persona"]), unsafe_allow_html=True)
+
+     it = ctx["item"]
+     st.markdown(f"""
+ <div class="card reveal d2">
+   <div class="card-kicker">The Item</div>
+   <span style="font-family:Spline Sans Mono,monospace;font-size:0.6rem;
+     letter-spacing:0.13em;text-transform:uppercase;color:var(--pine-2)">
+     {esc(it.domain)}</span>
+   <div style="font-family:Fraunces,serif;font-weight:600;font-size:1.14rem;
+     color:var(--ink);margin-top:0.1rem">{esc(it.title)}</div>
+ </div>""", unsafe_allow_html=True)
+
+     badge = '<span class="naija-badge">NAIJA VOICE</span>' if res.naija_mode else ""
+     st.markdown(f"""
+ <div class="panel reveal d3">
+   <div class="card-kicker">The Generated Review · written as the reader</div>
+   <div class="rating-row">
+     <span class="rating-chip">{res.rating:.1f}</span>
+     <span class="stars">{stars(res.rating)}</span>{badge}
+   </div>
+   <div class="review-body">{esc(res.review)}</div>
+ </div>""", unsafe_allow_html=True)
+
+     st.markdown(reflection_stepper(res.reflection_iterations,
+                                    res.reflection_refined,
+                                    res.reflection_notes), unsafe_allow_html=True)
+
+     st.markdown('<div class="sec-label">Why this rating</div>', unsafe_allow_html=True)
+     truth = ctx.get("truth")
+     if truth:
+         col1, col2 = st.columns(2)
+         with col1:
+             st.markdown(f"""
+ <div class="cmp agent reveal d1">
+   <div class="cmp-head">The agent rated it {res.rating:.1f}★</div>
+   <div class="cmp-body">{esc(res.reasoning)}</div>
+ </div>""", unsafe_allow_html=True)
+         with col2:
+             d = abs(res.rating - truth["rating"])
+             dc = "good" if d <= 0.5 else ("mid" if d <= 1.0 else "far")
+             t = truth["text"].replace("<br />", "\n").replace("<br>", "\n")
+             t = t[:520] + ("…" if len(t) > 520 else "")
+             st.markdown(f"""
+ <div class="cmp truth reveal d2">
+   <div class="cmp-head">The reader actually wrote &nbsp;
+     <span class="delta {dc}">Δ {d:.1f}★</span></div>
+   <div style="margin:0.15rem 0 0.35rem">
+     <span class="stars" style="color:var(--pine-2)">{stars(truth['rating'])}</span>
+     <span style="font-family:Spline Sans Mono,monospace;font-size:0.74rem;
+       color:#6f6651"> {truth['rating']:.1f}★</span></div>
+   <div class="cmp-body">{esc(t)}</div>
+ </div>""", unsafe_allow_html=True)
+     else:
+         st.markdown(f"""
+ <div class="cmp agent reveal d1">
+   <div class="cmp-head">The agent rated it {res.rating:.1f}★</div>
+   <div class="cmp-body">{esc(res.reasoning)}</div>
+ </div>""", unsafe_allow_html=True)
+     st.caption(f"grounded on {res.used_history_count} similar past reviews")
+ else:
+     st.markdown('<div class="empty">Compose a persona and a product, or pick a '
+                 'dataset reader, then press <b>Generate</b>. The agent writes '
+                 'the review in that reader\'s voice and shows its reasoning.</div>',
+                 unsafe_allow_html=True)
+
+ st.markdown("""
+ <div class="foot">
+   User Modeling Agent · DSN × BCT LLM Agent Challenge 2026 ·
+   persona → draft in-voice → self-reflection critique &amp; revise ·
+   rating predicted as persona prior adjusted by item evidence
+ </div>
+ """, unsafe_allow_html=True)