Commit 1d7aa33 (verified), committed by Israelbliz
1 Parent(s): 79bb546

Delete README.md

Files changed (1)
  1. README.md +0 -127
README.md DELETED
@@ -1,127 +0,0 @@
- ---
- title: User Modeling Agent
- emoji: 📝
- colorFrom: green
- colorTo: red
- sdk: docker
- app_port: 7860
- pinned: false
- ---
-
- # User Modeling Agent
-
- **DSN × BCT LLM Agent Challenge 2026, Task A.**
-
- An agent that models a person as a behavioural *persona*, then writes the
- star rating and the review that person would leave for an unseen product,
- critiquing and revising its own draft before returning it.
-
- > Live demo: *(your HuggingFace Space URL)*
- > Code: *(your GitHub repo URL)*
-
- ---
-
- ## What it does
-
- Given a **user persona** and **product details**, the agent produces:
-
- - a **star rating** (1–5) the user would likely give, and
- - a **written review** in that user's voice, with tone, length, and quirks matched.
-
- It is not a generic review generator. Every output is conditioned on a
- specific user, and the rating is reasoned, not guessed.
-
- ## The agentic workflow
-
- The system is an agent, not a single prompt. It runs a five-step loop:
-
- 1. **Build the persona.** A `PersonaEngine` extracts a structured persona:
- quantitative signals (average rating, rating spread, review length,
- domains, rating distribution) and qualitative voice (tone, preferred
- themes, common complaints, a one-line voice descriptor) distilled by an
- LLM from sample reviews. In the deployed app the persona can also be
- *composed directly* from typed input, which satisfies the brief's
- persona-as-input contract.
-
- 2. **Select grounding history.** For a real user, the agent picks the few
- past reviews most similar to the target item, so it writes from concrete
- evidence of how this person actually phrases things.
-
- 3. **Generate the rating and review.** A single LLM call, with the rating
- reasoned in two explicit steps: first the persona *prior* (what this
- user usually gives), then the *item evidence* (what the title and
- description signal). The final rating is the prior adjusted by the
- evidence, so a generous reviewer still rates a poor item low and a
- critical reviewer still rates a strong item high.
-
- 4. **Self-reflection: critique and revise.** A critic LLM audits the draft
- for rating–text consistency, voice match, and on-topic fit. If it
- objects, the agent rewrites with that feedback and re-checks, for up to
- two cycles. This act → critique → revise loop is what makes it an agent.
-
- 5. **Post-process.** The rating is clamped to the 1–5 range. An optional
- Nigerian Pidgin rendering layer can restyle the review while preserving
- meaning, sentiment, and rating.
-
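Steps 3 to 5 can be sketched as a small control loop. This is a minimal illustration, not the project's code: `Draft`, `run_agent`, `clamp_rating`, and the `generate`/`critique` callables are hypothetical names standing in for the LLM calls, and the sketch assumes that one cycle means "critique, then revise", that the last revision is returned without a further check, and that ratings are rounded to whole stars.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    rating: float
    review: str


# Hypothetical signatures: in the real agent both of these are LLM calls.
Generator = Callable[[Optional[str]], Draft]  # critic feedback (or None) -> draft
Critic = Callable[[Draft], Optional[str]]     # draft -> objection, or None if accepted

MAX_CYCLES = 2  # "up to two cycles"


def clamp_rating(rating: float, low: int = 1, high: int = 5) -> int:
    """Round to a whole star (assumption) and clamp into the valid range."""
    return max(low, min(high, round(rating)))


def run_agent(generate: Generator, critique: Critic) -> Draft:
    draft = generate(None)              # act: first draft, no feedback yet
    for _ in range(MAX_CYCLES):
        feedback = critique(draft)      # critique: consistency, voice, topic
        if feedback is None:            # critic accepts the draft
            break
        draft = generate(feedback)      # revise using the critic's objection
    # post-process: keep the text, force the rating into range
    return Draft(clamp_rating(draft.rating), draft.review)
```

Passing the generator and critic in as plain callables keeps the loop testable with stubs, without any network access.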
- The agent degrades gracefully: if an LLM call fails, it falls back to a
- deterministic persona rather than crashing.
-
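What the deterministic fallback contains is not spelled out above. The sketch below assumes it is the quantitative half of the persona from step 1, which needs no LLM. The function names, the review field names (`rating`, `text`, `domain`), and the use of population standard deviation for "rating spread" are all illustrative assumptions.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import Any, Callable, Dict, List


def quantitative_persona(reviews: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Compute the persona signals that need no LLM call.

    Expects at least one review; field names are hypothetical.
    """
    ratings = [r["rating"] for r in reviews]
    return {
        "avg_rating": mean(ratings),
        "rating_spread": pstdev(ratings),  # population std dev as the spread
        "avg_review_words": mean(len(r["text"].split()) for r in reviews),
        "domains": sorted({r["domain"] for r in reviews}),
        "rating_distribution": dict(Counter(ratings)),
    }


def build_persona(
    reviews: List[Dict[str, Any]],
    extract_voice: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Add the LLM-derived voice when possible, else keep the numbers only."""
    persona = quantitative_persona(reviews)
    try:
        persona["voice"] = extract_voice(reviews)  # hypothetical LLM call
    except Exception:
        persona["voice"] = None  # fall back rather than crash
    return persona
```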
- ## How it maps to the Task A rubric
-
- - **Review Text Quality**: reviews are grounded in the user's real past
- reviews and self-critiqued for voice match.
- - **Rating Accuracy**: the two-step prior-plus-evidence rating logic
- corrects the common failure of predicting from the user average alone.
- - **Behavioural Fidelity**: persona-conditioned generation; the persona
- portrait is visible in the app for inspection.
- - **Nigerian contextualization (bonus)**: a toggleable Nigerian Pidgin
- rendering layer; off by default so scored output stays standard English.
-
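The grounding behind the Review Text Quality point depends on picking past reviews that resemble the target item. The similarity measure is not stated here, so the sketch below swaps in plain token-overlap (Jaccard) similarity as a stand-in. `select_grounding` and the `item_title` field are hypothetical names, and a real implementation might use embeddings instead.

```python
import re
from typing import Dict, List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of tokens the two sets have in common (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_grounding(
    history: List[Dict[str, str]], target_text: str, k: int = 3
) -> List[Dict[str, str]]:
    """Return the k past reviews whose item titles best match the target."""
    target = _tokens(target_text)
    ranked = sorted(
        history,
        key=lambda r: jaccard(_tokens(r["item_title"]), target),
        reverse=True,  # most similar first; ties keep their original order
    )
    return ranked[:k]
```

Keeping `k` small matches the idea of "the few" most similar reviews: enough to show how the user phrases things, without flooding the prompt.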
- ## Running locally
-
- ```bash
- pip install -r requirements.txt
- # set your key in a .env file: LLM_PROVIDER=gemini and GEMINI_API_KEY=...
- streamlit run app.py
- ```
-
- The processed data (`data/processed/*.parquet`) must be present.
-
- A FastAPI service is also available:
-
- ```bash
- uvicorn task_a_user_modeling.main:app --reload
- ```
-
- ## Project layout
-
- ```
- core/                   shared engine: config, llm, persona, reflection, nigerian
- task_a_user_modeling/   the Impersonation agent + FastAPI service
- scripts/                test harness (test_task_a.py)
- data/processed/         Amazon Reviews 2023: Books · Movies & TV · Kindle Store
- app.py                  Streamlit demo
- ```
-
- ## Configuration
-
- Set in a `.env` file (never commit it):
-
- - `LLM_PROVIDER`: `gemini` or `openai`
- - `GEMINI_API_KEY` / `OPENAI_API_KEY`
-
- On a HuggingFace Space, set these as **Secrets** in Space settings.
-
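A minimal sketch of how these variables might be validated at startup, assuming the `.env` file or Space secrets have already been loaded into the process environment. `load_llm_config` and `KEY_VARS` are hypothetical names, not the project's own config module, and failing fast on a missing value is a design assumption.

```python
import os
from typing import Mapping, Tuple

# Which API key variable each supported provider needs.
KEY_VARS = {"gemini": "GEMINI_API_KEY", "openai": "OPENAI_API_KEY"}


def load_llm_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (provider, api_key), or raise a clear error if misconfigured."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in KEY_VARS:
        raise ValueError(
            f"LLM_PROVIDER must be one of {sorted(KEY_VARS)}, got {provider!r}"
        )
    key_var = KEY_VARS[provider]
    api_key = env.get(key_var, "").strip()
    if not api_key:
        raise ValueError(f"{key_var} is not set")
    return provider, api_key
```

Taking the environment as a parameter makes the check easy to exercise with a plain dictionary, so no real key is needed in tests.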
- ## Notes and honest limitations
-
- - The self-reflection critic checks internal consistency; it cannot catch a
- rating that is wrong but self-consistent.
- - Rating prediction on hard cases (a critical user who loved something) is
- improved by the two-step logic but can still be ~0.5–1.0★ off.
- - LLM output is non-deterministic; single-run results vary, so evaluation
- averages across many users.
-
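The hard case in the second note can be made concrete with a toy numeric stand-in for the two-step logic. In the agent itself the prior and the evidence are reasoned by the LLM inside the prompt. The weighted blend below, the `predict_rating` name, and the 0.3 prior weight are illustrative assumptions only, chosen to show why leaning on the item evidence beats predicting from the user average alone.

```python
def predict_rating(prior: float, evidence: float, prior_weight: float = 0.3) -> int:
    """Blend a user's usual rating with an item-quality signal.

    prior        -- what this user usually gives (for example their average)
    evidence     -- how strong the item looks on a 1-5 scale (assumed given)
    prior_weight -- how much the prior counts; an arbitrary illustrative value
    """
    if not 0.0 <= prior_weight <= 1.0:
        raise ValueError("prior_weight must be in [0, 1]")
    blended = prior_weight * prior + (1.0 - prior_weight) * evidence
    return max(1, min(5, round(blended)))  # whole stars, clamped to 1-5


# A critical user (average 2.4) meets a strong item (evidence 4.8).
critical_user_strong_item = predict_rating(prior=2.4, evidence=4.8)

# A generous user (average 4.6) meets a poor item (evidence 1.5).
generous_user_poor_item = predict_rating(prior=4.6, evidence=1.5)
```

With these inputs the blend gives 4 stars for the critical user and 2 stars for the generous one, where the user average alone would suggest about 2 and 5.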
- ## Credits
-
- Built for the DSN × BCT LLM Agent Challenge 2026.
- Author: *(your name)*. Dataset: Amazon Reviews 2023.