metadata
title: MediScribe AI
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false

MediScribe AI

AI-powered medical documentation assistant for the Gemma 4 for Good hackathon.

Record a doctor-patient consultation via your browser mic. MediScribe transcribes it, repairs ASR errors, extracts structured clinical data, and generates a professional SOAP note and patient summary, all powered by Gemma 4.

Features

  • Browser mic recording, no software install needed
  • Transcript repair + speaker labelling via Gemma 4
  • Structured symptom extraction via Gemma 4 function calling
  • RAG-grounded SOAP notes with ICD-10 codes and WHO drug references
  • Multimodal document analysis: upload lab results or prescriptions
  • Patient records stored in SQLite
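
As an illustration of the structured symptom extraction feature, here is a minimal sketch of what a function declaration and a validator for the model's returned arguments might look like. The function name `record_symptoms`, its fields, and the helper `parse_symptom_call` are all hypothetical; the actual schema in `agents/symptom_agent.py` is not shown in this README.

```python
import json

# Hypothetical function declaration in the JSON-schema style that
# function-calling APIs commonly accept. All field names are illustrative.
RECORD_SYMPTOMS_DECLARATION = {
    "name": "record_symptoms",
    "description": "Record symptoms mentioned in a consultation transcript.",
    "parameters": {
        "type": "object",
        "properties": {
            "symptoms": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "duration": {"type": "string"},
                        "severity": {
                            "type": "string",
                            "enum": ["mild", "moderate", "severe"],
                        },
                    },
                    "required": ["name"],
                },
            }
        },
        "required": ["symptoms"],
    },
}

ALLOWED_SEVERITIES = {"mild", "moderate", "severe"}


def parse_symptom_call(arguments_json: str) -> list[dict]:
    """Validate the JSON arguments of a (hypothetical) record_symptoms call."""
    args = json.loads(arguments_json)
    if not isinstance(args, dict) or not isinstance(args.get("symptoms"), list):
        raise ValueError("'symptoms' must be a list")
    cleaned = []
    for item in args["symptoms"]:
        if not isinstance(item, dict):
            raise ValueError("each symptom must be an object")
        name = str(item.get("name", "")).strip()
        if not name:
            raise ValueError("each symptom needs a non-empty 'name'")
        severity = item.get("severity")
        if severity is not None and severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unexpected severity: {severity!r}")
        cleaned.append(
            {"name": name, "duration": item.get("duration"), "severity": severity}
        )
    return cleaned
```

Validating the arguments before they reach the database keeps malformed model output from being stored as clinical data.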

Setup (local)

1. Install dependencies

pip install -r requirements.txt

2. Configure environment

cp .env.example .env
# Add your Google AI Studio API key

Get a free key at https://aistudio.google.com

3. Run

python app.py

Hugging Face Spaces

Set GEMINI_API_KEY as a Space secret in Settings → Variables and secrets.
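
Space secrets are exposed to the app as environment variables, so the key can be read the same way locally and on Spaces. The sketch below assumes the local `.env` file uses the same variable name, `GEMINI_API_KEY`, and that something else (for example a dotenv loader) has already placed it in the environment; the helper name `get_api_key` is hypothetical.

```python
import os


def get_api_key(env=None) -> str:
    """Return GEMINI_API_KEY from the environment, failing early if it is absent."""
    # `env` is injectable for testing; by default the process environment is used.
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set. Add it to .env for local runs "
            "or as a Space secret on Hugging Face."
        )
    return key
```

Failing at startup with a clear message is usually easier to debug than an authentication error on the first model call.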

Project Structure

├── app.py                      # Gradio UI + app logic
├── agents/
│   ├── symptom_agent.py        # Symptom extractor (Gemma 4 function calling)
│   └── cloud_agents.py         # SOAP, summary, transcript repair, document analysis
├── transcription/
│   └── transcriber.py          # faster-whisper batch transcription
├── rag/
│   ├── retriever.py            # ChromaDB + sentence-transformers RAG
│   └── data/                   # ICD-10 codes + WHO essential medicines
├── database/
│   └── db.py                   # SQLite helpers
└── requirements.txt
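
To give a feel for what the SQLite helpers in `database/db.py` might contain, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, columns, and function names are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) records table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS patient_records (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            patient_name TEXT NOT NULL,
            soap_note TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()
    return conn


def save_record(conn: sqlite3.Connection, patient_name: str, soap_note: str) -> int:
    """Insert one record and return its row id. Parameters are bound, not interpolated."""
    cur = conn.execute(
        "INSERT INTO patient_records (patient_name, soap_note) VALUES (?, ?)",
        (patient_name, soap_note),
    )
    conn.commit()
    return cur.lastrowid


def get_records(conn: sqlite3.Connection, patient_name: str) -> list[tuple]:
    """Return (id, soap_note) pairs for one patient, oldest first."""
    cur = conn.execute(
        "SELECT id, soap_note FROM patient_records WHERE patient_name = ? ORDER BY id",
        (patient_name,),
    )
    return cur.fetchall()
```

Passing a file path to `init_db` instead of the default `":memory:"` would persist records between runs.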

Architecture

Browser Mic
  └─► faster-whisper (CPU)      → raw transcript
        └─► Gemma 4 26B (API)   → cleaned transcript + speaker labels
              ├─► Gemma 4 function calling → structured symptom JSON
              ├─► ChromaDB RAG  → ICD-10 codes + drug dosages
              └─► Gemma 4 reasoning mode → SOAP note + patient summary
                    └─► SQLite  → patient records
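
The flow above can be sketched as a small orchestration function in which each stage is passed in as a callable. This is only an outline under stated assumptions: the names `run_pipeline` and `ConsultationResult` are hypothetical, and the diagram does not say what the RAG step is queried with, so the sketch assumes it receives the cleaned transcript and the extracted symptoms.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConsultationResult:
    raw_transcript: str
    cleaned_transcript: str
    symptoms: list
    references: list
    soap_note: str


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],          # e.g. a faster-whisper wrapper
    repair: Callable[[str], str],              # transcript repair + speaker labels
    extract_symptoms: Callable[[str], list],   # function-calling extraction
    retrieve: Callable[[str, list], list],     # RAG lookup (inputs are an assumption)
    write_soap: Callable[[str, list, list], str],
    store: Callable[[ConsultationResult], None],
) -> ConsultationResult:
    """Run the stages in the order shown in the architecture diagram."""
    raw = transcribe(audio_path)
    cleaned = repair(raw)
    symptoms = extract_symptoms(cleaned)
    references = retrieve(cleaned, symptoms)
    soap_note = write_soap(cleaned, symptoms, references)
    result = ConsultationResult(raw, cleaned, symptoms, references, soap_note)
    store(result)  # persist last, once every stage has succeeded
    return result
```

Injecting the stages as callables keeps the orchestration testable with simple stubs, without a microphone, model API, or database.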