---
title: Clinical Intake Agent
emoji: 🏥
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---

# Clinical Intake Agent

A LangGraph-based conversational agent for conducting pre-visit clinical intakes with simulated patients. The agent generates a structured ClinicalBrief (Chief Complaint, HPI, ROS) at the end of the conversation.

## Features

- **Multi-turn conversation** with stateful memory using LangGraph checkpointing
- **Structured clinical data collection**: Chief Complaint, HPI (OPQRST), and ROS
- **Conditional ROS scoping**: Adapts review of systems based on chief complaint
- **Vague answer handling**: Gracefully re-prompts when patient responses are unclear
- **Dual mode**: Runs as FastAPI web app OR CLI tool
- **Mock/Real LLM**: Switch between mock responses and real local LLM via environment variable
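
As a rough illustration of the ROS scoping and vague-answer handling listed above, the sketch below uses a keyword lookup and a small set of non-answers. The mapping, the phrase list, and the names `scope_ros` and `is_vague` are all hypothetical; the project's actual rules are not shown in this README.

```python
# Hypothetical mapping: chief-complaint keyword -> ROS systems worth asking about.
ROS_SCOPE = {
    "chest pain": ["cardiovascular", "respiratory", "gastrointestinal"],
    "headache": ["neurological", "eyes", "constitutional"],
    "abdominal pain": ["gastrointestinal", "genitourinary", "constitutional"],
}
DEFAULT_ROS = ["constitutional"]

# Hypothetical set of replies that carry no clinical information.
VAGUE_ANSWERS = {"", "idk", "i don't know", "not sure", "maybe", "kind of"}


def scope_ros(chief_complaint: str) -> list[str]:
    """Return the ROS systems to review for a free-text chief complaint."""
    text = chief_complaint.lower()
    for keyword, systems in ROS_SCOPE.items():
        if keyword in text:
            return systems
    return DEFAULT_ROS


def is_vague(answer: str) -> bool:
    """True when the answer should trigger a re-prompt."""
    return answer.strip().lower().rstrip(".!?") in VAGUE_ANSWERS
```

A keyword table keeps the scoping deterministic and easy to test; a real deployment might instead let the LLM pick the systems.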

## Architecture

```
Patient → triage_node → agent_node → (done or loop back for next question)
```

### Inference Engine

- **Local dev (mock)**: `MOCK_LLM=true` uses the regex-based MockLLM, 0ms latency
- **Production**: `MOCK_LLM=false` uses a local **Ollama** server (`qwen2.5:0.5b`, C++ optimized)
  - ~2s per turn on CPU vs 25s with raw PyTorch
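
The mock side of this switch could look roughly like the sketch below. It assumes `MOCK_LLM` defaults to `true` (as in the Configuration table); the regex rules, the `generate` method, and the `use_mock` helper are hypothetical stand-ins, since the real `MockLLM` in `app/llm.py` is not reproduced here.

```python
import os
import re


class MockLLM:
    """Regex-based stand-in for the model: no weights, no network."""

    # Hypothetical rules, checked in order; the first match wins.
    _RULES = [
        (re.compile(r"\b(pain|ache|hurts?)\b", re.I), "When did it first start?"),
        (re.compile(r"\b(hello|hi)\b", re.I), "What brings you in today?"),
    ]

    def generate(self, message: str) -> str:
        for pattern, reply in self._RULES:
            if pattern.search(message):
                return reply
        return "Could you tell me a bit more about that?"


def use_mock(env=None) -> bool:
    """Read MOCK_LLM from the environment, defaulting to mock mode."""
    env = os.environ if env is None else env
    return env.get("MOCK_LLM", "true").strip().lower() == "true"
```

Passing the environment in as a mapping makes the switch testable without mutating `os.environ`.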

### State Graph Nodes

1. **triage_node**: Detects acute emergency phrases → immediate 🚨 alert
2. **agent_node**: Single LLM call that extracts all HPI/ROS fields and generates the next question.  
   When all fields are complete, it builds the ClinicalBrief inline (no extra LLM call).
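
The control flow of the two nodes can be sketched in plain Python, without LangGraph, as follows. The emergency phrases and the `triage` and `next_step` names are hypothetical, and ROS completeness is left out for brevity; only the HPI field names come from the API reference in this README.

```python
# Hypothetical emergency phrases; the real triage list is not shown in this README.
EMERGENCY_PHRASES = ("can't breathe", "cannot breathe", "crushing chest pain", "passing out")

# HPI fields as they appear in the /chat response schema.
HPI_FIELDS = ("onset", "location", "duration", "character",
              "severity", "aggravating", "relieving")


def triage(message: str) -> bool:
    """True when the message contains an acute emergency phrase."""
    text = message.lower()
    return any(phrase in text for phrase in EMERGENCY_PHRASES)


def next_step(message: str, hpi: dict) -> str:
    """Route one turn: 'alert', 'ask' (loop for another question), or 'done'."""
    if triage(message):
        return "alert"
    missing = [field for field in HPI_FIELDS if not hpi.get(field)]
    return "ask" if missing else "done"
```

Running triage before anything else means an emergency message short-circuits the turn without waiting on the LLM.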

## Deployment on Hugging Face Spaces

This repo is configured as a **Docker SDK Space**. On every push:

1. The Docker image builds, and Ollama is installed via the official install script
2. `startup.sh` runs on container boot: it launches Ollama, pulls `qwen2.5:0.5b`, and starts FastAPI
3. App is live on port 7860

```bash
# Test the Docker build locally before pushing
docker build -t clinical-intake .
docker run -p 7860:7860 clinical-intake
```

## Local Development

```bash
# Fast mock mode (no model needed, instant responses)
MOCK_LLM=true uvicorn app.main:app --reload

# Real Ollama mode: requires Ollama installed and reachable at localhost:11434
ollama serve &
ollama pull qwen2.5:0.5b
MOCK_LLM=false uvicorn app.main:app --reload
```

## Usage

### FastAPI Web App

#### Health Check
```bash
curl http://localhost:7860/health
# Response: {"status": "ok", "mock_mode": true}
```

#### Chat Endpoint
```bash
# Start conversation
curl -X POST http://localhost:7860/chat \
  -H "Content-Type: application/json" \
  -d '{"session_id": "patient123", "message": "hello"}'

# Continue conversation
curl -X POST http://localhost:7860/chat \
  -H "Content-Type: application/json" \
  -d '{"session_id": "patient123", "message": "I have chest pain"}'

# The final response includes "brief" when "state" is "done"
```
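
The same calls can be made from Python using only the standard library. This is a minimal sketch, assuming the server is listening on `localhost:7860` as in the curl examples; `build_chat_request` and `chat` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumed to match the curl examples


def build_chat_request(session_id: str, message: str) -> urllib.request.Request:
    """Build the POST /chat request without sending it."""
    body = json.dumps({"session_id": session_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(session_id: str, message: str) -> dict:
    """Send one turn and return the decoded JSON response."""
    request = build_chat_request(session_id, message)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `chat("patient123", "hello")` should return a dict containing `reply` and `state`; reuse the same `session_id` on every turn so the conversation continues.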

### CLI Mode

```bash
# Run interactive CLI
python app/main.py --cli

# Example session:
# Agent: Hello! I'm here to help you with your pre-visit intake. What brings you in today?
# You: I have chest pain since this morning
# Agent: I understand you're experiencing chest pain. When did it first start?
# ... (continues through HPI and ROS) ...
# Agent: Your clinical intake is complete. Here is your summary:
# {
#   "chief_complaint": "chest pain",
#   "hpi": {...},
#   "ros": {...},
#   "generated_at": "2024-01-15T10:30:00Z"
# }
```

## API Reference

### POST /chat

**Request:**
```json
{
  "session_id": "string",
  "message": "string"
}
```

**Response:**
```json
{
  "reply": "string",
  "state": "intake|hpi|ros|brief_generation|done",
  "brief": {
    "chief_complaint": "string",
    "hpi": {
      "onset": "string",
      "location": "string",
      "duration": "string",
      "character": "string",
      "severity": "string",
      "aggravating": "string",
      "relieving": "string"
    },
    "ros": {
      "system_name": ["finding1", "finding2"]
    },
    "generated_at": "ISO8601 timestamp"
  }
}
```
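
A client might guard against incomplete responses with a small check like the one below. It is a sketch under one assumption not stated in this README: that `brief` is absent or null until `state` is `done`. The `extract_brief` name is hypothetical, and the project itself validates the brief with Pydantic models rather than this hand-written check.

```python
# Top-level keys of the brief, taken from the response schema above.
REQUIRED_BRIEF_KEYS = {"chief_complaint", "hpi", "ros", "generated_at"}


def extract_brief(response: dict):
    """Return the brief once the intake is done, or None while it is in progress."""
    if response.get("state") != "done":
        return None
    brief = response.get("brief")
    if not isinstance(brief, dict):
        raise ValueError("state is 'done' but no brief was returned")
    missing = REQUIRED_BRIEF_KEYS - brief.keys()
    if missing:
        raise ValueError(f"brief is missing keys: {sorted(missing)}")
    return brief
```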

### GET /health

**Response:**
```json
{
  "status": "ok",
  "mock_mode": true
}
```

## Configuration

| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `MOCK_LLM` | Use mock LLM responses (`true`) or real local LLM (`false`) | `true` |
| `MODEL_PATH` | Path to GGUF model file (used when `MOCK_LLM=false`) | `/models/qwen2.5-0.5b-instruct-q4_k_m.gguf` |

## Testing

```bash
# Run all tests (uses MockLLM automatically)
pytest tests/

# Run specific test
pytest tests/test_e2e.py::test_full_intake_flow -v

# Run with coverage
pytest --cov=app tests/
```

### Test Coverage

- ✅ `test_health_endpoint`: Verifies health check returns mock_mode status
- ✅ `test_full_intake_flow`: Complete conversation flow from greeting to ClinicalBrief
- ✅ `test_hpi_reprompt`: Validates vague answer re-prompting behavior
- ✅ `test_ros_scoping`: Confirms ROS systems are scoped based on chief complaint
- ✅ `test_brief_structure`: Validates ClinicalBrief Pydantic schema compliance

## Project Structure

```
clinical-intake-agent/
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI app + CLI entry point
│   ├── graph.py         # LangGraph state graph and nodes
│   ├── state.py         # TypedDict state definitions
│   ├── schemas.py       # Pydantic models (HPI, ClinicalBrief)
│   └── llm.py           # LLM provider (MockLLM, RealLLM)
├── tests/
│   ├── __init__.py
│   └── test_e2e.py      # End-to-end tests
├── requirements.txt
├── Dockerfile
└── README.md
```

## Dependencies

Minimal dependencies (no heavy ML libraries unless `MOCK_LLM=false`):

- `langgraph` - State graph orchestration
- `fastapi` - Web framework
- `uvicorn` - ASGI server
- `pydantic` - Data validation
- `pytest` + `pytest-asyncio` - Testing
- `httpx` - Async HTTP client for tests
- `llama-cpp-python` - Only in Docker prod layer for real LLM mode

## License

MIT

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit changes (`git commit -m 'Add amazing feature'`)
4. Push to branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Troubleshooting

### Model Download Fails

If running with `MOCK_LLM=false` and the model fails to download:

```bash
# Manually download the model
python -c "from huggingface_hub import hf_hub_download; hf_hub_download('bartowski/Qwen2.5-0.5B-Instruct-GGUF', 'Qwen2.5-0.5B-Instruct-Q4_K_M.gguf', local_dir='/models')"
```

### Session State Not Persisting

Ensure you're using the same `session_id` across multiple `/chat` calls. Sessions are stored in-memory per process, so they are lost when the process restarts and are not shared between separate worker processes.
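
To see why the `session_id` matters, consider this toy in-memory store. It is only an illustration: the app keeps state through LangGraph checkpointing, and the `_SESSIONS` dict and `record_turn` helper here are hypothetical.

```python
from collections import defaultdict

# Module-level dict: lives only as long as this Python process does.
_SESSIONS = defaultdict(list)


def record_turn(session_id: str, message: str) -> int:
    """Append a message to the session's history and return the turn count."""
    history = _SESSIONS[session_id]
    history.append(message)
    return len(history)
```

Two calls with the same `session_id` share one history, while a different id (or a different process) starts from an empty one.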

### Docker Build Fails

The Dockerfile skips model download if `MOCK_LLM=true`. To force model download in Docker:

```bash
docker build --build-arg MOCK_LLM=false -t clinical-intake-agent .
```