---
title: MediScribe AI
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
---

# MediScribe AI

AI-powered medical documentation assistant for the **Gemma 4 for Good** hackathon.

Record a doctor-patient consultation with your browser mic. MediScribe transcribes it, repairs ASR errors, extracts structured clinical data, and generates a professional SOAP note and patient summary, with Gemma 4 handling the repair, extraction, and note generation.

## Features

- **Browser mic recording**: no software install needed
- **Transcript repair + speaker labelling** via Gemma 4
- **Structured symptom extraction** via Gemma 4 function calling
- **RAG-grounded SOAP notes** with ICD-10 codes and WHO drug references
- **Multimodal document analysis**: upload lab results or prescriptions
- **Patient records** stored in SQLite
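
To make the function-calling feature concrete, here is a minimal sketch of what a tool declaration and the validation of its arguments could look like. The tool name `record_symptoms`, the field names, and the severity values are assumptions made for illustration; they are not taken from `symptom_agent.py`.

```python
import json

# Hypothetical severity vocabulary (an assumption, not the project's schema).
SEVERITIES = ("mild", "moderate", "severe")

# Hypothetical JSON-schema-style tool declaration that a model could be asked to call.
RECORD_SYMPTOMS_TOOL = {
    "name": "record_symptoms",
    "description": "Record the symptoms mentioned in a consultation transcript.",
    "parameters": {
        "type": "object",
        "properties": {
            "symptoms": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "duration": {"type": "string"},
                        "severity": {"type": "string", "enum": list(SEVERITIES)},
                    },
                    "required": ["name"],
                },
            }
        },
        "required": ["symptoms"],
    },
}


def parse_symptom_call(arguments_json: str) -> list:
    """Validate the JSON arguments of a function call and return a cleaned symptom list."""
    args = json.loads(arguments_json)
    if not isinstance(args, dict) or not isinstance(args.get("symptoms"), list):
        raise ValueError("'symptoms' must be a list")
    cleaned = []
    for item in args["symptoms"]:
        name = item.get("name") if isinstance(item, dict) else None
        if not isinstance(name, str) or not name.strip():
            raise ValueError("each symptom needs a non-empty 'name'")
        severity = item.get("severity")
        if severity is not None and severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {severity!r}")
        cleaned.append(
            {"name": name.strip(), "duration": item.get("duration"), "severity": severity}
        )
    return cleaned
```

Validating the arguments before storing them guards against malformed or out-of-vocabulary model output reaching the patient record.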

## Setup (local)

### 1. Install dependencies

```bash
pip install -r requirements.txt
```

### 2. Configure environment

```bash
cp .env.example .env
# Add your Google AI Studio API key
```

Get a free key at https://aistudio.google.com
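
As a rough sketch, the app could check for the key at startup and fail with a clear message when it is missing. This assumes the variable is named `GEMINI_API_KEY` (the name used for the Space secret) and that the values from `.env` have already been loaded into the process environment; `load_api_key` is a hypothetical helper, not a function from this repository.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None, name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error."""
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it to .env locally or as a Space secret."
        )
    return key
```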

### 3. Run

```bash
python app.py
```

## Hugging Face Spaces

Set `GEMINI_API_KEY` as a Space secret in Settings → Variables and secrets.

## Project Structure

```
├── app.py                      # Gradio UI + app logic
├── agents/
│   ├── symptom_agent.py        # Symptom extractor (Gemma 4 function calling)
│   └── cloud_agents.py         # SOAP, summary, transcript repair, document analysis
├── transcription/
│   └── transcriber.py          # faster-whisper batch transcription
├── rag/
│   ├── retriever.py            # ChromaDB + sentence-transformers RAG
│   └── data/                   # ICD-10 codes + WHO essential medicines
├── database/
│   └── db.py                   # SQLite helpers
└── requirements.txt
```
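
For orientation, SQLite helpers of the kind `database/db.py` provides might look roughly like the sketch below. The `consultations` table, its columns, and the function names are hypothetical and chosen only for illustration; the real schema may differ.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) consultations table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS consultations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            patient_name TEXT NOT NULL,
            soap_note TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()


def save_consultation(conn: sqlite3.Connection, patient_name: str, soap_note: str) -> int:
    """Insert one consultation and return its row id."""
    cur = conn.execute(
        "INSERT INTO consultations (patient_name, soap_note) VALUES (?, ?)",
        (patient_name, soap_note),  # parameterized to avoid SQL injection
    )
    conn.commit()
    return cur.lastrowid


def notes_for_patient(conn: sqlite3.Connection, patient_name: str) -> list:
    """Return all SOAP notes stored for a patient, oldest first."""
    rows = conn.execute(
        "SELECT soap_note FROM consultations WHERE patient_name = ? ORDER BY id",
        (patient_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Passing the connection in explicitly keeps the helpers easy to test against an in-memory database (`sqlite3.connect(":memory:")`).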

## Architecture

```
Browser Mic
  └─► faster-whisper (CPU)      → raw transcript
        └─► Gemma 4 26B (API)   → cleaned transcript + speaker labels
              ├─► Gemma 4 function calling → structured symptom JSON
              ├─► ChromaDB RAG  → ICD-10 codes + drug dosages
              └─► Gemma 4 reasoning mode → SOAP note + patient summary
                    └─► SQLite  → patient records
```
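
The data flow in the diagram can be sketched as a plain function that chains the stages. This is an illustration only: each stage is passed in as a callable so the sketch stays self-contained, and `run_pipeline`, `ConsultationResult`, and the parameter names are hypothetical rather than names from `app.py`. How the retrieved references feed into the SOAP step is also an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ConsultationResult:
    raw_transcript: str
    clean_transcript: str
    symptoms: Any
    references: Any
    soap_note: str


def run_pipeline(
    audio_path: str,
    *,
    transcribe: Callable[[str], str],         # e.g. faster-whisper
    repair: Callable[[str], str],             # transcript repair + speaker labels
    extract_symptoms: Callable[[str], Any],   # function calling -> structured JSON
    retrieve: Callable[[str], Any],           # RAG lookup (ICD-10 codes, drug references)
    write_soap: Callable[[str, Any, Any], str],
    save: Callable[[ConsultationResult], None],
) -> ConsultationResult:
    """Run the stages in diagram order and persist the final result."""
    raw = transcribe(audio_path)
    clean = repair(raw)
    # The three branches below all start from the cleaned transcript.
    symptoms = extract_symptoms(clean)
    references = retrieve(clean)
    soap = write_soap(clean, symptoms, references)
    result = ConsultationResult(raw, clean, symptoms, references, soap)
    save(result)
    return result
```

Injecting the stages as callables makes it possible to exercise the flow with stubs, without a model API key or audio hardware.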