---
title: Clinical Intake Agent
emoji: 🏥
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---
# Clinical Intake Agent
A LangGraph-based conversational agent for conducting pre-visit clinical intakes with simulated patients. At the end of the conversation, the agent generates a structured ClinicalBrief covering the Chief Complaint, History of Present Illness (HPI), and Review of Systems (ROS).
## Features
- **Multi-turn conversation** with stateful memory using LangGraph checkpointing
- **Structured clinical data collection**: Chief Complaint, HPI (OPQRST), and ROS
- **Conditional ROS scoping**: Adapts review of systems based on chief complaint
- **Vague answer handling**: Gracefully re-prompts when patient responses are unclear
- **Dual mode**: Runs as a FastAPI web app or as a CLI tool
- **Mock/Real LLM**: Switch between mock responses and real local LLM via environment variable
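
As a rough illustration of conditional ROS scoping, the sketch below maps a chief complaint to the body systems worth reviewing. The keyword table, the `scope_ros` name, and the fallback list are hypothetical; they are not taken from this repo's `graph.py`.

```python
# Hypothetical sketch: the keyword table and function name are illustrative
# assumptions, not this project's actual scoping rules.
ROS_SCOPE = {
    "chest pain": ["cardiovascular", "respiratory", "gastrointestinal"],
    "headache": ["neurological", "eyes", "constitutional"],
    "cough": ["respiratory", "ent", "constitutional"],
}
DEFAULT_SCOPE = ["constitutional"]


def scope_ros(chief_complaint: str) -> list[str]:
    """Return the ROS systems to ask about for a given chief complaint."""
    text = chief_complaint.lower()
    for keyword, systems in ROS_SCOPE.items():
        if keyword in text:
            return list(systems)  # copy so callers cannot mutate the table
    return list(DEFAULT_SCOPE)
```

A lookup table keeps the scoping deterministic and easy to test; a real implementation might instead ask the LLM to pick the systems.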
## Architecture
```
Patient → triage_node → agent_node → (done or loop back for next question)
```
### Inference Engine
- **Local dev (mock)**: `MOCK_LLM=true` → regex-based MockLLM, 0ms latency
- **Production**: `MOCK_LLM=false` → **Ollama** local server (`qwen2.5:0.5b`, C++ optimized)
  - ~2s per turn on CPU vs 25s with raw PyTorch
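
To give a feel for what a regex-based mock can look like, here is a minimal sketch. The patterns, the two field names shown, and the `extract` method are assumptions for illustration; the actual `MockLLM` in `app/llm.py` may be structured differently.

```python
import re


# Hypothetical sketch of a regex-based mock; patterns and method name are
# illustrative assumptions, not the project's real MockLLM.
class MockLLM:
    PATTERNS = {
        # Text after "since/started/began", up to the next comma or period.
        "onset": re.compile(r"\b(?:since|started|began)\s+([^,.]+)", re.IGNORECASE),
        # A pain score such as "7/10" or "7 out of 10".
        "severity": re.compile(r"\b(\d{1,2})\s*(?:/|out of)\s*10\b", re.IGNORECASE),
    }

    def extract(self, message: str) -> dict[str, str]:
        """Return whichever fields the patterns can find in the message."""
        fields = {}
        for name, pattern in self.PATTERNS.items():
            match = pattern.search(message)
            if match:
                fields[name] = match.group(1).strip()
        return fields
```

Because no model is loaded, a mock like this answers immediately, which is what makes it suitable for tests and local iteration.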
### State Graph Nodes
1. **triage_node**: Detects acute emergency phrases → immediate 🚨 alert
2. **agent_node**: Single LLM call → extracts all HPI/ROS fields AND generates next question
   When all fields are complete, it builds the ClinicalBrief inline (no extra LLM call)
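
The two nodes above can be sketched in plain Python, without LangGraph, to show the control flow. The emergency phrase list, the reduced set of required fields, and the function signatures are all assumptions made for this example.

```python
# Plain-Python sketch of the two-node flow; it does not use LangGraph.
# The phrase list, field names, and signatures are illustrative assumptions.
EMERGENCY_PHRASES = ("can't breathe", "crushing chest pain", "passing out")
REQUIRED_FIELDS = ("onset", "severity")


def triage_node(message: str) -> bool:
    """Return True when the message contains an acute emergency phrase."""
    text = message.lower()
    return any(phrase in text for phrase in EMERGENCY_PHRASES)


def agent_node(state: dict, extracted: dict) -> dict:
    """Merge newly extracted fields, then either ask again or finish."""
    state = {**state, **extracted}
    missing = [field for field in REQUIRED_FIELDS if field not in state]
    if missing:
        reply = f"Can you tell me about the {missing[0]}?"
        return {"state": state, "reply": reply, "done": False}
    # All fields present: build the brief inline, with no further LLM call.
    return {"state": state, "reply": "Your intake is complete.",
            "done": True, "brief": dict(state)}
```

In the real graph, the loop back to `agent_node` would be driven by the next patient message rather than by a direct function call.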
## Deployment on Hugging Face Spaces
This repo is configured as a **Docker SDK Space**. On every push:
1. The Docker image builds → Ollama is installed via the official install script
2. `startup.sh` starts on container boot: launches Ollama, pulls `qwen2.5:0.5b`, starts FastAPI
3. App is live on port 7860
```bash
# Test the Docker build locally before pushing
docker build -t clinical-intake .
docker run -p 7860:7860 clinical-intake
```
## Local Development
```bash
# Fast mock mode (no model needed, instant responses)
MOCK_LLM=true uvicorn app.main:app --reload
# Real Ollama mode: requires Ollama running at localhost:11434
ollama serve &
ollama pull qwen2.5:0.5b
MOCK_LLM=false uvicorn app.main:app --reload
```
## Usage
### FastAPI Web App
#### Health Check
```bash
curl http://localhost:7860/health
# Response: {"status": "ok", "mock_mode": true}
```
#### Chat Endpoint
```bash
# Start conversation
curl -X POST http://localhost:7860/chat \
  -H "Content-Type: application/json" \
  -d '{"session_id": "patient123", "message": "hello"}'
# Continue conversation
curl -X POST http://localhost:7860/chat \
  -H "Content-Type: application/json" \
  -d '{"session_id": "patient123", "message": "I have chest pain"}'
# The final response includes the clinical brief (the `brief` field) when state == "done"
```
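
The same calls can be made from Python using only the standard library. This is a sketch: `build_chat_request` and `send_chat` are hypothetical helper names, and the base URL assumes the app is listening on port 7860 as in the curl examples.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumed from the curl examples


def build_chat_request(session_id: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request."""
    payload = json.dumps({"session_id": session_id, "message": message})
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(session_id: str, message: str) -> dict:
    """Send one chat turn and return the decoded JSON response."""
    request = build_chat_request(session_id, message)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Reusing the same `session_id` across `send_chat` calls is what keeps the turns in one conversation.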
### CLI Mode
```bash
# Run interactive CLI
python app/main.py --cli
# Example session:
# Agent: Hello! I'm here to help you with your pre-visit intake. What brings you in today?
# You: I have chest pain since this morning
# Agent: I understand you're experiencing chest pain. When did it first start?
# ... (continues through HPI and ROS) ...
# Agent: Your clinical intake is complete. Here is your summary:
# {
#   "chief_complaint": "chest pain",
#   "hpi": {...},
#   "ros": {...},
#   "generated_at": "2024-01-15T10:30:00Z"
# }
```
## API Reference
### POST /chat
**Request:**
```json
{
  "session_id": "string",
  "message": "string"
}
```
**Response:**
```json
{
  "reply": "string",
  "state": "intake|hpi|ros|brief_generation|done",
  "brief": {
    "chief_complaint": "string",
    "hpi": {
      "onset": "string",
      "location": "string",
      "duration": "string",
      "character": "string",
      "severity": "string",
      "aggravating": "string",
      "relieving": "string"
    },
    "ros": {
      "system_name": ["finding1", "finding2"]
    },
    "generated_at": "ISO8601 timestamp"
  }
}
```
```
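
A client will usually want to act on `brief` only once the intake has finished. The helper below is a sketch of that check. `extract_brief` is a hypothetical name, and the assumption that `brief` is only meaningful when `state` is `done` comes from the usage notes earlier, not from the server code.

```python
from typing import Optional

# State values listed in the response schema above.
VALID_STATES = {"intake", "hpi", "ros", "brief_generation", "done"}


def extract_brief(response: dict) -> Optional[dict]:
    """Return the brief once the intake is done, otherwise None."""
    state = response.get("state")
    if state not in VALID_STATES:
        raise ValueError(f"unexpected state: {state!r}")
    if state != "done":
        return None  # intake still in progress
    return response.get("brief")
```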
### GET /health
**Response:**
```json
{
  "status": "ok",
  "mock_mode": true
}
```
## Configuration
| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `MOCK_LLM` | Use mock LLM responses (`true`) or real local LLM (`false`) | `true` |
| `MODEL_PATH` | Path to GGUF model file (used when `MOCK_LLM=false`) | `/models/qwen2.5-0.5b-instruct-q4_k_m.gguf` |
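
One way these variables might be read at startup is sketched below. The `Settings` class and `load_settings` function are hypothetical, and accepting `1` or `yes` as true is an assumption; the table only documents `true` and `false`.

```python
import os
from dataclasses import dataclass

DEFAULT_MODEL_PATH = "/models/qwen2.5-0.5b-instruct-q4_k_m.gguf"


@dataclass(frozen=True)
class Settings:
    mock_llm: bool
    model_path: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: "1" and "yes" also count as true; only true/false is documented.
    raw = env.get("MOCK_LLM", "true").strip().lower()
    return Settings(
        mock_llm=raw in {"true", "1", "yes"},
        model_path=env.get("MODEL_PATH", DEFAULT_MODEL_PATH),
    )
```

Passing a plain dict as `env` makes the parsing easy to unit test without touching the real environment.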
## Testing
```bash
# Run all tests (uses MockLLM automatically)
pytest tests/
# Run specific test
pytest tests/test_e2e.py::test_full_intake_flow -v
# Run with coverage
pytest --cov=app tests/
```
### Test Coverage
- ✅ `test_health_endpoint`: Verifies health check returns mock_mode status
- ✅ `test_full_intake_flow`: Complete conversation flow from greeting to ClinicalBrief
- ✅ `test_hpi_reprompt`: Validates vague answer re-prompting behavior
- ✅ `test_ros_scoping`: Confirms ROS systems are scoped based on chief complaint
- ✅ `test_brief_structure`: Validates ClinicalBrief Pydantic schema compliance
## Project Structure
```
clinical-intake-agent/
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI app + CLI entry point
│   ├── graph.py         # LangGraph state graph and nodes
│   ├── state.py         # TypedDict state definitions
│   ├── schemas.py       # Pydantic models (HPI, ClinicalBrief)
│   └── llm.py           # LLM provider (MockLLM, RealLLM)
├── tests/
│   ├── __init__.py
│   └── test_e2e.py      # End-to-end tests
├── requirements.txt
├── Dockerfile
└── README.md
```
## Dependencies
Minimal dependencies (no heavy ML libraries unless `MOCK_LLM=false`):
- `langgraph` - State graph orchestration
- `fastapi` - Web framework
- `uvicorn` - ASGI server
- `pydantic` - Data validation
- `pytest` + `pytest-asyncio` - Testing
- `httpx` - Async HTTP client for tests
- `llama-cpp-python` - Only in Docker prod layer for real LLM mode
## License
MIT
## Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit changes (`git commit -m 'Add amazing feature'`)
4. Push to branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Troubleshooting
### Model Download Fails
If running with `MOCK_LLM=false` and the model fails to download:
```bash
# Manually download the model
python -c "from huggingface_hub import hf_hub_download; hf_hub_download('bartowski/Qwen2.5-0.5B-Instruct-GGUF', 'Qwen2.5-0.5B-Instruct-Q4_K_M.gguf', local_dir='/models')"
```
### Session State Not Persisting
Ensure you're using the same `session_id` across multiple `/chat` calls. Sessions are stored in-memory per process.
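
To see why in-memory storage behaves this way, consider the simplified stand-in below. The project itself relies on LangGraph checkpointing; the `_SESSIONS` dict and both helper functions are hypothetical and only illustrate the per-process limitation.

```python
# Hypothetical in-memory session store, keyed by session_id.
# A module-level dict lives in one process only, so state is lost on restart
# and is not shared between multiple worker processes.
_SESSIONS: dict[str, dict] = {}


def get_session(session_id: str) -> dict:
    """Return the existing state for session_id, creating it on first use."""
    return _SESSIONS.setdefault(session_id, {"turns": []})


def record_turn(session_id: str, message: str) -> int:
    """Append a message to the session and return the turn count."""
    session = get_session(session_id)
    session["turns"].append(message)
    return len(session["turns"])
```

A different `session_id`, a server restart, or a request routed to another worker process would each start from an empty session.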
### Docker Build Fails
The Dockerfile skips model download if `MOCK_LLM=true`. To force model download in Docker:
```bash
docker build --build-arg MOCK_LLM=false -t clinical-intake-agent .
```