| title: MediScribe AI | |
| emoji: 🏥 | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: gradio | |
| sdk_version: 6.14.0 | |
| python_version: '3.13' | |
| app_file: app.py | |
| pinned: false | |
| # MediScribe AI | |
| AI-powered medical documentation assistant for the **Gemma 4 for Good** hackathon. | |
| Record a doctor-patient consultation via your browser mic. MediScribe transcribes it, repairs ASR errors, extracts structured clinical data, and generates a professional SOAP note and patient summary, all powered by Gemma 4. | |
| ## Features | |
| - **Browser mic recording**: no software install needed | |
| - **Transcript repair + speaker labelling** via Gemma 4 | |
| - **Structured symptom extraction** via Gemma 4 function calling | |
| - **RAG-grounded SOAP notes** with ICD-10 codes and WHO drug references | |
| - **Multimodal document analysis**: upload lab results or prescriptions | |
| - **Patient records** stored in SQLite | |
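As a rough illustration of the last feature, here is a minimal sketch of storing and reading back notes with Python's built-in `sqlite3` module. The table name `consultations`, its columns, and the helper names `init_db`, `save_consultation`, and `notes_for_patient` are hypothetical; the project's actual schema in `database/db.py` may look quite different.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical consultations table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS consultations (
            id           INTEGER PRIMARY KEY AUTOINCREMENT,
            patient_name TEXT NOT NULL,
            soap_note    TEXT NOT NULL,
            created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()


def save_consultation(conn: sqlite3.Connection, patient_name: str, soap_note: str) -> int:
    """Insert one note and return its row id. Parameters are bound, not interpolated."""
    cur = conn.execute(
        "INSERT INTO consultations (patient_name, soap_note) VALUES (?, ?)",
        (patient_name, soap_note),
    )
    conn.commit()
    return cur.lastrowid


def notes_for_patient(conn: sqlite3.Connection, patient_name: str) -> list:
    """Return all stored notes for one patient, oldest first."""
    rows = conn.execute(
        "SELECT soap_note FROM consultations WHERE patient_name = ? ORDER BY id",
        (patient_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
    init_db(conn)
    save_consultation(conn, "Example Patient", "S: cough for three days ...")
    print(notes_for_patient(conn, "Example Patient"))
```

Binding values with `?` placeholders keeps free-text notes from being interpreted as SQL, which matters when the text comes from a transcript.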
| ## Setup (local) | |
| ### 1. Install dependencies | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ### 2. Configure environment | |
| ```bash | |
| cp .env.example .env | |
| # Add your Google AI Studio API key | |
| ``` | |
| Get a free key at https://aistudio.google.com | |
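For orientation, the sketch below shows one way an app could pick up the key at startup using only the standard library. It assumes the `.env` entry is named `GEMINI_API_KEY`, the same name used for the Space secret; the helper `load_api_key` is hypothetical, and the real `app.py` may load its configuration differently (for example with a dotenv library).

```python
import os
from pathlib import Path


def load_api_key(env_path: str = ".env", name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, falling back to a .env file."""
    # A real environment variable wins over the file.
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_path)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            if key.strip() == name:
                cleaned = raw.strip().strip("\"'")  # drop optional surrounding quotes
                if cleaned:
                    return cleaned

    raise RuntimeError(f"{name} is not set; add it to {env_path} or the environment")


if __name__ == "__main__":
    try:
        print("Key loaded, length:", len(load_api_key()))
    except RuntimeError as exc:
        print(exc)
```

Failing fast with a clear message is usually easier to debug than letting the first API call fail with an authentication error.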
| ### 3. Run | |
| ```bash | |
| python app.py | |
| ``` | |
| ## Hugging Face Spaces | |
| Set `GEMINI_API_KEY` as a Space secret in Settings → Variables and secrets. | |
| ## Project Structure | |
| ``` | |
| ├── app.py                    # Gradio UI + app logic | |
| ├── agents/ | |
| │   ├── symptom_agent.py      # Symptom extractor (Gemma 4 function calling) | |
| │   └── cloud_agents.py       # SOAP, summary, transcript repair, document analysis | |
| ├── transcription/ | |
| │   └── transcriber.py        # faster-whisper batch transcription | |
| ├── rag/ | |
| │   ├── retriever.py          # ChromaDB + sentence-transformers RAG | |
| │   └── data/                 # ICD-10 codes + WHO essential medicines | |
| ├── database/ | |
| │   └── db.py                 # SQLite helpers | |
| └── requirements.txt | |
| ``` | |
| ## Architecture | |
| ``` | |
| Browser Mic | |
|   └─► faster-whisper (CPU) → raw transcript | |
|   └─► Gemma 4 26B (API) → cleaned transcript + speaker labels | |
|   └─► Gemma 4 function calling → structured symptom JSON | |
|   └─► ChromaDB RAG → ICD-10 codes + drug dosages | |
|   └─► Gemma 4 reasoning mode → SOAP note + patient summary | |
|   └─► SQLite → patient records | |
| ``` | |
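The flow above can be read as a chain of stages that each enrich one shared record. The sketch below illustrates that shape only: every stage is a placeholder that writes dummy text instead of calling faster-whisper, Gemma 4, ChromaDB, or SQLite, and the names `Consultation`, `run_pipeline`, and the stage functions are hypothetical rather than taken from the project.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Consultation:
    """Data accumulated as one recording moves through the pipeline."""
    audio_path: str
    raw_transcript: str = ""
    clean_transcript: str = ""
    symptoms: Dict[str, object] = field(default_factory=dict)
    references: List[str] = field(default_factory=list)
    soap_note: str = ""
    trace: List[str] = field(default_factory=list)  # names of stages that ran


Stage = Callable[[Consultation], Consultation]


def transcribe(c: Consultation) -> Consultation:
    # Placeholder for faster-whisper transcription on CPU.
    c.raw_transcript = f"<raw transcript of {c.audio_path}>"
    c.trace.append("transcribe")
    return c


def repair(c: Consultation) -> Consultation:
    # Placeholder for the Gemma 4 call that cleans text and labels speakers.
    c.clean_transcript = f"<cleaned: {c.raw_transcript}>"
    c.trace.append("repair")
    return c


def extract_symptoms(c: Consultation) -> Consultation:
    # Placeholder for Gemma 4 function calling that returns structured JSON.
    c.symptoms = {"source_chars": len(c.clean_transcript)}
    c.trace.append("extract_symptoms")
    return c


def retrieve(c: Consultation) -> Consultation:
    # Placeholder for the ChromaDB lookup of ICD-10 codes and drug dosages.
    c.references = ["<retrieved reference>"]
    c.trace.append("retrieve")
    return c


def write_soap(c: Consultation) -> Consultation:
    # Placeholder for the Gemma 4 call that drafts the SOAP note and summary.
    c.soap_note = f"<SOAP note grounded in {len(c.references)} reference(s)>"
    c.trace.append("write_soap")
    return c


def save(c: Consultation) -> Consultation:
    # Placeholder for persisting the record to SQLite.
    c.trace.append("save")
    return c


PIPELINE: List[Stage] = [transcribe, repair, extract_symptoms, retrieve, write_soap, save]


def run_pipeline(c: Consultation, stages: Optional[List[Stage]] = None) -> Consultation:
    """Run each stage in order, passing the enriched record along."""
    for stage in (PIPELINE if stages is None else stages):
        c = stage(c)
    return c


if __name__ == "__main__":
    result = run_pipeline(Consultation(audio_path="visit.wav"))
    print(result.trace)
```

Keeping each stage as a plain function with the same signature makes it straightforward to swap a placeholder for a real model call or to test one stage in isolation.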