---
title: Caps Chatbot Internal
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: apache-2.0
short_description: CAPS Chatbot — Internal Review Portal Co-designed AI peer su
---

# CAPS Chatbot — Sanyu (Internal Review Portal)

> Co-designed AI peer support for adolescents and young people living with HIV | Expert safety review — not for clinical use.

---

## What this project is

**Sanyu** is a co-designed AI peer support chatbot for adolescents and young people living with HIV (AYPLHIV) aged 15–24 in Uganda. Built by **CAPS-IDI**, this is an internal review/prototype portal — not yet approved for clinical or public use.

---

## Tech Stack

| Layer | Choice |
|---|---|
| Frontend/UI | Gradio (`gr.ChatInterface`) |
| LLM | Google Gemini 2.5 Flash via `google-genai` SDK |
| Auth | Hugging Face OAuth (`hf_oauth: true`) |
| Hosting | Hugging Face Spaces (Gradio SDK) |
| Python deps | `gradio>=4.0.0`, `google-genai` |

---

## How the App Works

1. **`META_PROMPT`** — A detailed (~370-line) system prompt defining Sanyu's persona, tone, content knowledge, and behavioral rules.
2. **`extract_text(content)`** — Utility to handle both plain strings and Gradio's structured `[{"type": "text", ...}]` message format.
3. **`respond(message, history)`** — The chat handler. Converts Gradio's history (supports both dict-format and tuple-format) into Gemini `types.Content` objects, appends the new user message, then calls `client.models.generate_content()` with the system prompt injected via `GenerateContentConfig`.
4. **`gr.ChatInterface`** — Wraps `respond` into a simple web UI with title and description.
5. **API key** — Loaded from the `GOOGLE_API_KEY` environment variable (set as a Hugging Face Space secret).

---

## System Prompt Design

The `META_PROMPT` is the intellectual core of the project.
It was co-designed with AYPLHIV and health workers through modified Delphi consensus workshops. It encodes:

### 12-Dimension Voice Matrix

1. **Empathy & Understanding First** — acknowledge emotions before giving information
2. **Non-Judgmental Language** — no blame, no "why didn't you…"
3. **User Agency** — present options, not directives; user is the decision-maker
4. **Patience / No Time Pressure** — never rush; let the user lead the pace
5. **Concise by Default** — 2–4 sentences; no walls of text
6. **Warm but Not Frivolous** — peer-like language, match the user's energy
7. **Empowerment & Capacity Building** — build confidence and self-advocacy over time
8. **Comfort & Reassurance** — affirming, hopeful, counter internalized stigma
9. **Structured Guidance When Requested** — numbered steps for "how do I…" questions
10. **Evidence-Based with Conversational Delivery** — factual but accessible; Uganda-specific context
11. **Progressive / Realistic Goals** — graduated steps, not all-or-nothing advice
12. **Storytelling as Support Tool** — anonymised vignettes to illustrate how others cope

### Content Domains

- **Medication adherence** — barriers, practical strategies, non-shaming approach
- **Disclosure strategies** — multiple approaches, user-led, safety-first
- **Mental health & self-stigma** — normalisation, affirmations, self-acceptance
- **Sexual & reproductive health** — contraception, STIs, pregnancy, SRH rights
- **Relationships** — romantic partners, family, peer dynamics
- **GBV safety protocols** — crisis detection, escalation triggers, referral pathways

### Safety & Limits

- Hard boundary: **no medical prescriptions**
- Crisis triggers (suicidal ideation, active abuse, safety risk) → immediate escalation prompt
- Always refers complex/crisis cases to human counsellors and peer supporters
- "Referral is a feature, not a failure."
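
The escalation rule above lives in the system prompt, so it depends on the model following instructions. Nothing in this README says the app also checks messages in code. If a deterministic backstop were wanted, one option is a small phrase pre-check that runs before the model call. The sketch below is only an illustration under that assumption: `TRIGGERS`, `ESCALATION_PROMPT`, and `detect_trigger` are hypothetical names, and the phrase list is a placeholder, not a validated screening tool.

```python
import re
from typing import Optional

# Hypothetical trigger phrases, for illustration only. A real list would need
# clinical review and Luganda coverage before any use with young people.
TRIGGERS = {
    "suicidal ideation": [r"\bkill myself\b", r"\bend my life\b", r"\bwant to die\b"],
    "active abuse": [r"\b(beats|hits|hurts) me\b", r"\bforced me\b"],
    "safety risk": [r"\bnot safe\b", r"\bin danger\b"],
}

# Hypothetical wording; the actual escalation text is defined in META_PROMPT.
ESCALATION_PROMPT = (
    "It sounds like you may be going through something very hard right now. "
    "Would you like help reaching a counsellor or peer supporter?"
)


def detect_trigger(message: str) -> Optional[str]:
    """Return the first matching crisis category, or None if nothing matches."""
    text = message.lower()
    for category, patterns in TRIGGERS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return category
    return None
```

A handler could call `detect_trigger(message)` first and return `ESCALATION_PROMPT` when it is not `None`, falling through to the model otherwise. Phrase matching misses indirect wording, so it could only complement the prompt-level rule and human referral, not replace them.
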
### Language & Accessibility

- Default: English; Luganda code-switching accepted
- Plain language targeting ~8 years of education
- Age-adapted: different tone/content for 14–17 vs 18–24 year olds
- Few-shot examples from real counselling dialogues embedded in the prompt

---

## Known Limitations

| Issue | Detail |
|---|---|
| No persistent memory | The prompt requires remembering users across sessions, but there is no database or session storage — memory only lasts within a single Gradio session |
| No streaming | `generate_content()` is synchronous — users see nothing until the full response is ready |
| No error handling | Unhandled exceptions if the Gemini API fails (rate limit, network error, etc.) |

---

## Setup

1. Add your `HF_TOKEN` as a Hugging Face Space secret (Settings -> Variables and secrets).
   - Generate a token at https://huggingface.co/settings/tokens (read access is sufficient)
   - This is required to use the HF Inference API without rate limits
2. Dependencies are in `requirements.txt`:
   ```
   gradio>=4.0.0
   huggingface_hub>=0.33.0
   sentence-transformers>=2.7.0
   faiss-cpu>=1.8.0
   pdfplumber>=0.11.0
   ```
3. The Space will auto-launch `app.py` on startup.

## LLM

The app uses **`meta-llama/Llama-3.2-3B-Instruct`** via the Hugging Face Inference API (serverless). No GPU required — inference runs on HF hosted infrastructure.
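
Two points from earlier sections can be sketched together: the history conversion described for `respond` under How the App Works, and the missing error handling listed under Known Limitations. The code below is a minimal, backend-agnostic sketch and not the contents of `app.py`. `to_turns`, `respond_safely`, `FALLBACK`, and the injected `generate` callable are hypothetical names. The only outside fact it relies on is that Gemini labels assistant turns with the role `model`.

```python
import time

# Hypothetical fallback wording shown to the user when every attempt fails.
FALLBACK = "Sorry, I could not reply just now. Please try again in a moment."


def extract_text(content):
    """Return plain text from a string or a Gradio-style list of parts."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return " ".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    return "" if content is None else str(content)


def to_turns(history):
    """Flatten dict-format or tuple-format history into (role, text) pairs."""
    turns = []
    for item in history or []:
        if isinstance(item, dict):
            role = "model" if item.get("role") == "assistant" else "user"
            turns.append((role, extract_text(item.get("content"))))
        else:  # legacy (user_message, bot_message) pair
            user_msg, bot_msg = item
            if user_msg:
                turns.append(("user", extract_text(user_msg)))
            if bot_msg:
                turns.append(("model", extract_text(bot_msg)))
    return turns


def respond_safely(message, history, generate, retries=2, base_delay=1.0,
                   sleep=time.sleep):
    """Call generate(turns), retrying with exponential backoff on failure."""
    turns = to_turns(history) + [("user", extract_text(message))]
    for attempt in range(retries + 1):
        try:
            return generate(turns)
        except Exception:  # real code should narrow this to the SDK's errors
            if attempt < retries:
                sleep(base_delay * 2 ** attempt)
    return FALLBACK
```

In the real app, `generate` would wrap whichever model call is in use and turn each `(role, text)` pair into that backend's message type. Passing it in as a parameter keeps the retry logic testable without network access.
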