MikelWL committed
Commit d9fdc34 · 1 Parent(s): a4d5849

Add HF deploy docs and local Docker runner

Files changed (5)
  1. .gitignore +11 -1
  2. PLAN.md +101 -0
  3. docs/hf.md +58 -0
  4. run_docker_local.sh +24 -0
  5. start_hf_space.sh +0 -0
.gitignore CHANGED
@@ -1,2 +1,12 @@
  *.pyc
- .env
+ __pycache__/
+ .pytest_cache/
+ .mypy_cache/
+ .ruff_cache/
+ .venv/
+
+ .DS_Store
+
+ .env
+ .env.*
+ !.env.example
PLAN.md ADDED
@@ -0,0 +1,101 @@
# PLAN.md — ConverTA Next Deliverables

This plan captures two deliverables requested after the PI demo. The current hosting target is Hugging Face Spaces (Docker).

## Deliverables

### 1) Configuration Panel (Personas + Prompt Tweaks)
**Goal:** Let a user select existing surveyor/patient personas and adjust the prompt/model parameters used to start a conversation.

**User-visible outcomes**
- UI panel to choose `surveyor_persona_id` and `patient_persona_id` from saved personas.
- Optional prompt overrides (system prompt additions/edits) and model settings (e.g., temperature, model id) that affect the conversation run.

**Primary risks**
- Keeping UI state in sync with backend state if a conversation is already running.
- Avoiding “configuration drift” between what the user selected and what was actually sent.

### 2) Resource Agent Panel (Post-Conversation Insights via Resource Agent Prompt)
**Goal (higher priority):** After a conversation completes, run a dedicated “resource agent” analysis (an additional LLM call) and render structured insights in the Resources panel.

**User-visible outcomes**
- Resources panel populates automatically at conversation end with:
  - Patient health situation(s) mentioned + supporting evidence snippets.
  - Care experience evaluation (good/bad/neutral) + reasons + evidence snippets.
- Clear status UI: `idle → running → complete` and error handling (see the sketch at the end of this section).

**Primary risks**
- Reliably detecting “conversation ended” (stop button, backend status, disconnect, timeout).
- Capturing a complete transcript (including persona metadata).
- Latency/cost of the extra LLM call and its failure modes.

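To keep that status progression honest, a tiny state machine is enough. A minimal sketch, assuming nothing about the real backend (`AnalysisStatus` and `advance` are hypothetical names, not existing code):

```python
from enum import Enum

class AnalysisStatus(str, Enum):
    """Lifecycle of the post-conversation resource-agent run."""
    IDLE = "idle"
    RUNNING = "running"
    COMPLETE = "complete"
    ERROR = "error"

# Legal transitions; anything else indicates a trigger bug (e.g. a double run).
_TRANSITIONS = {
    AnalysisStatus.IDLE: {AnalysisStatus.RUNNING},
    AnalysisStatus.RUNNING: {AnalysisStatus.COMPLETE, AnalysisStatus.ERROR},
}

def advance(current: AnalysisStatus, target: AnalysisStatus) -> AnalysisStatus:
    if target not in _TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```
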
32
+ ## Proposed Implementation Order
33
+
34
+ 1. **Deliverable #2 — Slice 1 (plumbing, end-to-end)**
35
+ 2. **Deliverable #1 — Minimal configuration UI (persona selection)**
36
+ 3. **Deliverable #2 — Slice 2 (quality, schema, robustness)**
37
+ 4. **Deliverable #1 — Advanced configuration (prompt edits + model params)**
38
+
39
+ Rationale: implement the highest-value path (#2) first but in thin slices, so we get a fast “works end-to-end” demo without blocking on perfect configuration UX.
40
+
41
+ ## Milestones and Acceptance Criteria
42
+
43
+ ### Milestone A — Transcript Capture + End-of-Conversation Trigger (for #2)
44
+ - Transcript stored per `conversation_id` with ordered messages and persona metadata.
45
+ - Trigger condition (MVP): run analysis on `conversation_status: completed` only.
46
+ - Transcript scope (MVP): include utterances only (no system prompts, no routing events).
47
+ - Acceptance: transcript matches the conversation shown in the Messages panel.
48
+
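A minimal sketch of that in-memory store; all names here are hypothetical, and the real message shape depends on the backend:

```python
from dataclasses import dataclass, field

@dataclass
class Utterance:
    speaker: str      # e.g. "surveyor" or "patient"
    persona_id: str   # persona metadata captured with every message
    text: str

@dataclass
class Transcript:
    conversation_id: str
    utterances: list[Utterance] = field(default_factory=list)  # insertion order

# In-memory store keyed by conversation_id (MVP fast path; reset per conversation).
_TRANSCRIPTS: dict[str, Transcript] = {}

def record(conversation_id: str, speaker: str, persona_id: str, text: str) -> None:
    t = _TRANSCRIPTS.setdefault(conversation_id, Transcript(conversation_id))
    t.utterances.append(Utterance(speaker, persona_id, text))
```
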
### Milestone B — Resource Agent LLM Call (for #2)
- On end-of-conversation, trigger the analysis request once per conversation (see the guard sketched below).
- Use the same underlying model as the conversation by default; the “resource agent” difference is the system prompt/context and desired output schema (not a different provider/model).
- Acceptance: Resources panel shows an analysis result for a completed conversation.

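The once-per-conversation rule is worth guarding explicitly, since end-of-conversation events can arrive twice (reconnects, races). A sketch building on the transcript store above; `run_resource_agent` is a hypothetical stand-in for the actual call:

```python
# Conversations whose analysis has already been triggered (run-once guard).
_ANALYZED: set[str] = set()

async def run_resource_agent(transcript: Transcript) -> None:
    """Stand-in for the single analysis LLM call (prompt differs, model does not)."""
    ...

async def on_conversation_status(conversation_id: str, status: str) -> None:
    # MVP trigger: fire only on "completed", and at most once per conversation.
    if status != "completed" or conversation_id in _ANALYZED:
        return
    _ANALYZED.add(conversation_id)
    transcript = _TRANSCRIPTS.get(conversation_id)
    if transcript is not None:
        await run_resource_agent(transcript)
```
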
### Milestone C — Structured Output + UI Rendering (for #2)
- Resource agent prompt requests a strict JSON schema (validated on receipt; sketched below).
- Evidence best practice (MVP): the model returns evidence pointers into the transcript and the app extracts exact evidence snippets programmatically.
- UI renders:
  - `health_situation`: list of items with `summary`, `evidence[]`, `confidence`.
  - `care_experience`: `sentiment`, `reasons[]`, `evidence[]`, `confidence`.
- Acceptance: output is stable across runs (no random placeholders), and errors are displayed without breaking the app.

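One way to pin the schema down is Pydantic, which a FastAPI stack already has. A sketch; only the field names in the bullets above come from the plan, everything else is an assumption:

```python
from typing import Literal

from pydantic import BaseModel

class Evidence(BaseModel):
    utterance_index: int  # pointer into the transcript, returned by the model
    quote: str = ""       # exact snippet, filled in by the app, not the model

class HealthSituation(BaseModel):
    summary: str
    evidence: list[Evidence]
    confidence: float

class CareExperience(BaseModel):
    sentiment: Literal["good", "bad", "neutral"]
    reasons: list[str]
    evidence: list[Evidence]
    confidence: float

class AnalysisResult(BaseModel):
    health_situation: list[HealthSituation]
    care_experience: CareExperience

def attach_quotes(result: AnalysisResult, utterances: list[str]) -> AnalysisResult:
    # The model only returns indices; the app extracts the exact snippets itself,
    # so a quoted evidence string can never be hallucinated.
    for item in result.health_situation:
        for ev in item.evidence:
            ev.quote = utterances[ev.utterance_index]
    for ev in result.care_experience.evidence:
        ev.quote = utterances[ev.utterance_index]
    return result
```

Validation on receipt is then `AnalysisResult.model_validate_json(raw)` (Pydantic v2), with a rendered error state on failure instead of a crash.
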
### Milestone D — Minimal Configuration Panel (for #1)
- UI fetches available personas and allows selecting:
  - surveyor persona
  - patient persona
- Selection affects the next “Start Conversation” payload (see the request sketch after Milestone E).
- Acceptance: changing the persona selection changes the personas used in the conversation.

### Milestone E — Prompt/Model Overrides (for #1)
- UI supports optional overrides:
  - per-agent prompt additions
  - model id, temperature, max tokens (as supported)
- Acceptance: overrides are visible in logs and reflected in the conversation behavior.

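Both milestones can share one request model, with the Milestone E fields optional so the minimal panel still works. A sketch; only the two persona ids appear in the plan, the override fields are illustrative:

```python
from pydantic import BaseModel

class StartConversationRequest(BaseModel):
    surveyor_persona_id: str
    patient_persona_id: str
    # Milestone E extras: all optional, so the Milestone D panel needs none of them.
    prompt_additions: dict[str, str] | None = None  # agent name -> extra system text
    model_id: str | None = None
    temperature: float | None = None
    max_tokens: int | None = None
```

Logging the parsed model verbatim when a conversation starts covers the “visible in logs” acceptance check and doubles as a guard against the configuration drift flagged under Deliverable #1.
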
## Technical Notes (Current Stack Reality)

- The deployed Space runs a single FastAPI server (`frontend/react_gradio_hybrid.py`) and mounts the backend under `/api`.
- Frontend communicates via `/ws/frontend/{conversation_id}` and bridges to `/api/ws/conversation/{conversation_id}` using `WebSocketManager` (a generic bridge sketch follows).

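For orientation, the general shape of such a bridge is below. This is a generic sketch, not the project's `WebSocketManager`; it assumes the `websockets` client library and a text-only protocol:

```python
import asyncio

import websockets
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws/frontend/{conversation_id}")
async def frontend_ws(ws: WebSocket, conversation_id: str) -> None:
    await ws.accept()
    backend_url = f"ws://127.0.0.1:7860/api/ws/conversation/{conversation_id}"
    async with websockets.connect(backend_url) as backend:
        async def up() -> None:    # browser -> backend
            while True:
                await backend.send(await ws.receive_text())
        async def down() -> None:  # backend -> browser
            while True:
                await ws.send_text(await backend.recv())
        # A disconnect on either side raises out of gather; leaving the
        # async-with block then closes the backend socket as well.
        await asyncio.gather(up(), down())
```
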
## Design Decisions to Make Early

1. **Where does the resource agent run?**
   - Decision (MVP): backend-side analysis function, so the logic is not duplicated and can be reused by any UI.

2. **How do we store the transcript?**
   - Decision (MVP): in-memory per conversation (fast path), with a clear reset on each new conversation.
   - Persistence (future): save transcript + analysis; start with a simple file/JSONL and move to a DB later.

3. **How do we configure analysis vs. conversation?**
   - Default: reuse `LLM_*` for model/provider and vary only the prompt/context for the resource agent (see the sketch below).
   - Optional (future): allow separate `RESOURCE_LLM_*` overrides to pick a different model/provider for analysis.

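The fallback rule stays small. A sketch; the `RESOURCE_LLM_*` names are the future option named above, not something that exists yet:

```python
import os

def analysis_setting(name: str) -> str | None:
    # Analysis reuses the conversation's LLM_* settings by default; a
    # RESOURCE_LLM_* variable, if set, overrides it for the analysis call only.
    return os.environ.get(f"RESOURCE_LLM_{name}", os.environ.get(f"LLM_{name}"))

# analysis_setting("MODEL") -> RESOURCE_LLM_MODEL if set, else LLM_MODEL
```
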
4. **How do we render partial results?**
   - Optional: stream analysis updates (later); the initial version can be single-shot.

## Open Questions

- Do we want multiple “resource agent” passes (health vs. care) or one combined prompt returning two sections?
- Should users be able to rerun analysis with different prompts/models from the config panel? (Out of MVP scope.)
- What’s the minimal on-disk persistence format we want first (JSON per conversation vs. JSONL append)?
- Versioning (MVP): store `schema_version`, `analysis_prompt_version`, and `app_version` (git SHA) with each analysis record (see the sketch below).
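If JSONL append wins, a record carrying those version fields might look like this sketch (the file name and version values are placeholders):

```python
import json
import subprocess
import time

def analysis_record(conversation_id: str, analysis: dict) -> str:
    """Serialize one analysis as a single JSONL line with version metadata."""
    app_version = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return json.dumps({
        "conversation_id": conversation_id,
        "created_at": time.time(),
        "schema_version": 1,
        "analysis_prompt_version": 1,
        "app_version": app_version,
        "analysis": analysis,
    })

# Append-only writes stay trivial:
# with open("analyses.jsonl", "a") as f:
#     f.write(analysis_record(cid, data) + "\n")
```
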
docs/hf.md ADDED
@@ -0,0 +1,58 @@
# Hugging Face Spaces (Docker) — Deploy + Debug

This project is deployed as a Hugging Face Space using the Docker SDK.

## One-time setup (Space UI)

Space: `https://huggingface.co/spaces/MikelWL/ConverTA`

In Space → Settings → Variables and secrets:

**Secrets**
- `LLM_API_KEY`: OpenRouter API key

**Variables**
- `LLM_BACKEND`: `openrouter`
- `LLM_HOST`: `https://openrouter.ai/api/v1`
- `LLM_MODEL`: e.g. `google/gemini-3-flash-preview`
- `LLM_SITE_URL`: `https://huggingface.co/spaces/MikelWL/ConverTA` (optional)
- `LLM_APP_NAME`: `ConverTA` (optional)
- `FRONTEND_WEBSOCKET_URL`: `ws://127.0.0.1:7860/api/ws/conversation`
- `FRONTEND_BACKEND_BASE_URL`: `http://127.0.0.1:7860/api` (optional)

Restart the Space after changing secrets/variables.

## Local smoke test (HF-like)

Run the Docker image locally before pushing to HF:

```bash
./run_docker_local.sh
```

Then open `http://localhost:7860` and click **Start Conversation**.

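For a scripted check instead of the browser, a short poll works; this assumes the root page answers HTTP 200 once the app is up:

```python
import time
import urllib.request

# Retry for up to ~60s while the container starts.
for _ in range(30):
    try:
        with urllib.request.urlopen("http://localhost:7860/", timeout=2) as resp:
            print("up:", resp.status)
            break
    except OSError:
        time.sleep(2)
else:
    raise SystemExit("app never became reachable on :7860")
```
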
## Deploy (push to Space repo)

The Space is also configured as a local git remote (`hf`).

```bash
git push hf main
```

If the Space repo ever gets reset/recreated and your push is rejected with “fetch first”, use:

```bash
git push --force hf main
```

## Troubleshooting

- **UI loads but QA Monitor shows “Failed to connect to backend”**
  - Ensure `FRONTEND_WEBSOCKET_URL` is set to `ws://127.0.0.1:7860/api/ws/conversation`.
- **Space crashes on startup**
  - Check Space → Logs for the Python traceback.
  - Confirm `PORT` is being respected (HF sets it automatically; we bind to `0.0.0.0:$PORT`).
- **OpenRouter errors**
  - Confirm the `LLM_API_KEY` secret is set and `LLM_MODEL` is valid on OpenRouter.

run_docker_local.sh ADDED
@@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -Eeuo pipefail

IMAGE_NAME="${IMAGE_NAME:-converta:local}"
HOST_PORT="${HOST_PORT:-7860}"
CONTAINER_PORT="${CONTAINER_PORT:-7860}"

ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

echo "Building Docker image: ${IMAGE_NAME}"
docker build -t "${IMAGE_NAME}" "${ROOT_DIR}"

ENV_ARGS=()
if [[ -f "${ROOT_DIR}/.env" ]]; then
  ENV_ARGS+=(--env-file "${ROOT_DIR}/.env")
fi

echo "Running container on http://localhost:${HOST_PORT}"
exec docker run --rm -it \
  -p "${HOST_PORT}:${CONTAINER_PORT}" \
  -e "PORT=${CONTAINER_PORT}" \
  "${ENV_ARGS[@]}" \
  "${IMAGE_NAME}"

start_hf_space.sh CHANGED
File without changes