Spaces:

CAPS-IDI
/

caps-chatbot-internal

Sleeping

App Files Files Community

caps-chatbot-internal / README.md

atwine

Switch LLM from Gemini to Llama-3.2-3B-Instruct via HF Inference API

a937e6c 13 days ago

preview code

raw

history blame contribute delete

5.27 kB

A newer version of the Gradio SDK is available: 6.16.0

Upgrade

metadata

title: Caps Chatbot Internal
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: apache-2.0
short_description: CAPS Chatbot — Internal Review Portal Co-designed AI peer su

CAPS Chatbot — Sanyu (Internal Review Portal)

Co-designed AI peer support for adolescents and young people living with HIV | Expert safety review — not for clinical use.

What this project is

Sanyu is a co-designed AI peer support chatbot for adolescents and young people living with HIV (AYPLHIV) aged 15–24 in Uganda. Built by CAPS-IDI, this is an internal review/prototype portal — not yet approved for clinical or public use.

Tech Stack

Layer	Choice
Frontend/UI	Gradio (`gr.ChatInterface`)
LLM	Google Gemini 2.5 Flash via `google-genai` SDK
Auth	Hugging Face OAuth (`hf_oauth: true`)
Hosting	Hugging Face Spaces (Gradio SDK)
Python deps	`gradio>=4.0.0`, `google-genai`

How the App Works

META_PROMPT — A detailed (~370-line) system prompt defining Sanyu's persona, tone, content knowledge, and behavioral rules.
extract_text(content) — Utility to handle both plain strings and Gradio's structured [{"type": "text", ...}] message format.
respond(message, history) — The chat handler. Converts Gradio's history (supports both dict-format and tuple-format) into Gemini types.Content objects, appends the new user message, then calls client.models.generate_content() with the system prompt injected via GenerateContentConfig.
gr.ChatInterface — Wraps respond into a simple web UI with title and description.
API key — Loaded from the GOOGLE_API_KEY environment variable (set as a Hugging Face Space secret).

System Prompt Design

The META_PROMPT is the intellectual core of the project. It was co-designed with AYPLHIV and health workers through modified Delphi consensus workshops. It encodes:

12-Dimension Voice Matrix

Empathy & Understanding First — acknowledge emotions before giving information
Non-Judgmental Language — no blame, no "why didn't you…"
User Agency — present options, not directives; user is the decision-maker
Patience / No Time Pressure — never rush; let the user lead the pace
Concise by Default — 2–4 sentences; no walls of text
Warm but Not Frivolous — peer-like language, match the user's energy
Empowerment & Capacity Building — build confidence and self-advocacy over time
Comfort & Reassurance — affirming, hopeful, counter internalized stigma
Structured Guidance When Requested — numbered steps for "how do I…" questions
Evidence-Based with Conversational Delivery — factual but accessible; Uganda-specific context
Progressive / Realistic Goals — graduated steps, not all-or-nothing advice
Storytelling as Support Tool — anonymised vignettes to illustrate how others cope

Content Domains

Medication adherence — barriers, practical strategies, non-shaming approach
Disclosure strategies — multiple approaches, user-led, safety-first
Mental health & self-stigma — normalisation, affirmations, self-acceptance
Sexual & reproductive health — contraception, STIs, pregnancy, SRH rights
Relationships — romantic partners, family, peer dynamics
GBV safety protocols — crisis detection, escalation triggers, referral pathways

Safety & Limits

Hard boundary: no medical prescriptions
Crisis triggers (suicidal ideation, active abuse, safety risk) → immediate escalation prompt
Always refers complex/crisis cases to human counsellors and peer supporters
"Referral is a feature, not a failure."

Language & Accessibility

Default: English; Luganda code-switching accepted
Plain language targeting ~8 years of education
Age-adapted: different tone/content for 14–17 vs 18–24 year olds
Few-shot examples from real counselling dialogues embedded in the prompt

Known Limitations

Issue	Detail
No persistent memory	The prompt requires remembering users across sessions, but there is no database or session storage — memory only lasts within a single Gradio session
No streaming	`generate_content()` is synchronous — users see nothing until the full response is ready
No error handling	Unhandled exceptions if the Gemini API fails (rate limit, network error, etc.)

Setup

Add your HF_TOKEN as a Hugging Face Space secret (Settings -> Variables and secrets).
- Generate a token at https://huggingface.co/settings/tokens (read access is sufficient)
- This is required to use the HF Inference API without rate limits

Dependencies are in requirements.txt:

gradio>=4.0.0
huggingface_hub>=0.33.0
sentence-transformers>=2.7.0
faiss-cpu>=1.8.0
pdfplumber>=0.11.0

The Space will auto-launch app.py on startup.

LLM

The app uses meta-llama/Llama-3.2-3B-Instruct via the Hugging Face Inference API (serverless). No GPU required — inference runs on HF hosted infrastructure.