| title: Clinical Scribe | |
| emoji: π | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: gradio | |
| sdk_version: 6.11.0 | |
| app_file: app.py | |
| pinned: false | |
| # OpenScribe: AI Clinical Documentation | |
| **OpenScribe** is an educational demonstration of an AI-powered clinical scribe that converts doctor-patient conversations into structured SOAP (Subjective, Objective, Assessment, Plan) notes. | |
| > **⚠️ Disclaimer:** Not intended for real clinical use. | |
| --- | |
| ## Features | |
| | Component | Implementation | | |
| |-----------|----------------| | |
| | **Speech-to-Text** | AssemblyAI Universal-2 (100 hrs/month free tier) | | |
| | **Clinical NLP** | Rule-based entity extraction (keyword + pattern matching) | | |
| | **Output Format** | Structured SOAP Note | | |
| | **Interface** | Gradio web UI with microphone & file upload support | | |
| | **Fallback Mode** | Demo transcript when API key not configured | | |
| --- | |
| ## Live Demo | |
| Try it on Hugging Face Spaces: | |
| **[OpenScribe Demo](https://huggingface.co/spaces/arafatanam/OpenScribe)** | |
| --- | |
| ## How It Works | |
| ``` | |
| ┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐ | |
| │   Audio Input   │ ──▶ │  AssemblyAI STT  │ ──▶ │   Transcript    │ | |
| │  (Upload/Mic)   │     │  (Universal-2)   │     │                 │ | |
| └─────────────────┘     └──────────────────┘     └────────┬────────┘ | |
|                                                           │ | |
|                                                           ▼ | |
| ┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐ | |
| │    SOAP Note    │ ◀── │  Rule-Based NLP  │ ◀── │ Entity Extract  │ | |
| │    (Output)     │     │ (Keyword Match)  │     │  Symptoms/Dx    │ | |
| └─────────────────┘     └──────────────────┘     └─────────────────┘ | |
| ``` | |
| ### Pipeline Steps: | |
| 1. **Upload/Record Audio** – Supports MP3, WAV, M4A formats | |
| 2. **Transcription** – AssemblyAI processes audio and returns text | |
| 3. **Entity Extraction** – Rule-based NLP identifies: | |
|    - Symptoms (cough, fever, fatigue, wheezing, etc.) | |
|    - Duration and aggravating factors | |
|    - Physical exam findings | |
| 4. **Diagnosis Mapping** – Keyword patterns map to likely diagnoses | |
| 5. **Treatment Plan** – Generates evidence-based recommendations | |
| 6. **SOAP Note Output** – Structured clinical documentation | |
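
The entity-extraction step can be illustrated with a minimal sketch. This is not the code in `app.py`; the keyword table, the negation rule, the choice to skip questions, and the function names `extract_symptoms` and `extract_duration` are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from typing import Optional

# Hypothetical keyword table: surface phrase -> normalized symptom label.
SYMPTOM_KEYWORDS = {
    "cough": "Cough",
    "fever": "Fever",
    "fatigue": "Fatigue",
    "wheezing": "Wheezing",
    "winded": "Dyspnea on exertion",
}

# A negation cue that reaches the keyword without crossing a comma,
# e.g. "no fever" (assumed heuristic, not a full negation detector).
NEGATION = re.compile(r"\b(?:no|not|denies|without)\b[^.?!,]{0,20}$", re.IGNORECASE)

# Durations such as "two weeks" or "3 days".
DURATION = re.compile(
    r"\b(\d+|one|two|three|four|five|six|seven|eight|nine|ten)\s+(day|week|month)s?\b",
    re.IGNORECASE,
)


def extract_symptoms(transcript: str) -> list[str]:
    """Return labels for symptoms that are stated and not negated."""
    found: list[str] = []
    for sentence in re.split(r"(?<=[.?!])\s+", transcript):
        # Questions ("Any fever?") ask about a symptom; they do not report one.
        if sentence.rstrip().endswith("?"):
            continue
        for keyword, label in SYMPTOM_KEYWORDS.items():
            match = re.search(rf"\b{re.escape(keyword)}\b", sentence, re.IGNORECASE)
            if match is None:
                continue
            # Skip mentions directly preceded by a negation cue.
            if NEGATION.search(sentence[: match.start()]):
                continue
            if label not in found:
                found.append(label)
    return found


def extract_duration(transcript: str) -> Optional[str]:
    """Return the first duration phrase in the transcript, if any."""
    match = DURATION.search(transcript)
    return match.group(0).lower() if match else None
```

Splitting into sentences first keeps the negation check local, so "No fever, but I get winded" drops the fever and keeps the exertional dyspnea.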
| --- | |
| ## Installation | |
| ### Local Development | |
| ```bash | |
| # Clone the repository | |
| git clone https://huggingface.co/spaces/arafatanam/OpenScribe | |
| cd OpenScribe | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Run the app | |
| python app.py | |
| ``` | |
| ### Hugging Face Spaces Deployment | |
| 1. Create a new Space at [huggingface.co/new-space](https://huggingface.co/new-space) | |
| 2. Choose **Gradio** as the SDK | |
| 3. Upload `app.py` and `requirements.txt` | |
| 4. Add your AssemblyAI API key to **Settings → Secrets**: | |
|    - Name: `ASSEMBLYAI_API_KEY` | |
|    - Value: `your_api_key_here` | |
| 5. Restart the Space | |
| --- | |
| ## API Configuration | |
| ### AssemblyAI (Required for Live Transcription) | |
| 1. Sign up for free at [assemblyai.com](https://www.assemblyai.com) | |
| 2. Get your API key from the dashboard | |
| 3. Add it to your Space's secrets as `ASSEMBLYAI_API_KEY` | |
| **Without an API key:** The app runs in demo mode using a sample transcript. | |
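
A minimal sketch of how the demo-mode switch could work, assuming the key is read from the environment (Spaces exposes secrets to the app as environment variables). The helper names `get_api_key` and `get_transcript`, the `transcribe` callable, and the sample text are hypothetical and may differ from what `app.py` does.

```python
import os

# Hypothetical sample transcript used when no key is configured.
DEMO_TRANSCRIPT = (
    "Doctor: Hello, what brings you in today?\n"
    "Patient: I've had a cough for about two weeks. It gets worse at night."
)


def get_api_key(env=None):
    """Return the configured key, or None when it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    return key or None


def get_transcript(audio_path, transcribe, env=None):
    """Return (text, mode): live transcription with a key, demo text without.

    `transcribe` is a caller-supplied function taking (audio_path, api_key);
    it stands in for the real speech-to-text call.
    """
    key = get_api_key(env)
    if key is None:
        return DEMO_TRANSCRIPT, "demo"
    return transcribe(audio_path, key), "live"
```

Returning the mode alongside the text lets the UI tell the user when they are looking at sample data instead of their own recording.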
| --- | |
| ## Production Comparison | |
| | Component | OpenScribe Demo | Viscrow Health Production | | |
| |-----------|-----------------|---------------------------| | |
| | Speech-to-Text | AssemblyAI Universal-2 | Azure Speech Services / Whisper | | |
| | Summarization | Rule-Based NLP | Fine-tuned Llama 3 8B | | |
| | Output Format | SOAP Note | SOAP Note + ICD-10 Billing Codes | | |
| | Accuracy | ~85% (rule-based) | 94% (LLM) | | |
| | Error Handling | Multi-tier fallback | Validation pipeline | | |
| --- | |
| ## Example Output | |
| ### Input Transcript: | |
| ``` | |
| Doctor: Hello, what brings you in today? | |
| Patient: I've had a cough for about two weeks. It gets worse at night. | |
| Doctor: Any fever? | |
| Patient: No fever, but I get winded climbing stairs. | |
| Doctor: Let me listen... I hear some mild wheezing. | |
| ``` | |
| ### Generated SOAP Note: | |
| ``` | |
| SUBJECTIVE: | |
| Chief Complaint: Cough (2 weeks duration) | |
| Associated Symptoms: Fatigue, Dyspnea on exertion, Nocturnal cough | |
| Duration: 2 weeks | |
| Aggravating Factors: Nighttime, exertion | |
| OBJECTIVE: | |
| Physical Exam: Mild expiratory wheezing on auscultation | |
| Vital Signs: Temperature 98.6°F, HR 72, BP 118/76, RR 16, SpO2 97% | |
| ASSESSMENT: | |
| Primary Diagnosis: Acute Bronchitis with Reactive Airway Disease | |
| Clinical Confidence: Moderate | |
| PLAN: | |
| - Albuterol HFA 90mcg, 2 puffs q4-6h PRN for wheezing | |
| - Supportive care (acute bronchitis typically viral) | |
| - Rest and increased fluid intake | |
| - Follow up in 7 days if symptoms persist | |
| ``` | |
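
The note layout above can be produced from structured fields with a small renderer. The sketch below is one way to do it; the `SoapNote` dataclass and `render` function are hypothetical names, not taken from `app.py`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SoapNote:
    """Hypothetical container for the four SOAP sections."""

    subjective: dict[str, str] = field(default_factory=dict)
    objective: dict[str, str] = field(default_factory=dict)
    assessment: dict[str, str] = field(default_factory=dict)
    plan: list[str] = field(default_factory=list)


def render(note: SoapNote) -> str:
    """Format the note as plain text: labeled fields, then a bulleted plan."""
    lines: list[str] = []
    for heading, fields in (
        ("SUBJECTIVE", note.subjective),
        ("OBJECTIVE", note.objective),
        ("ASSESSMENT", note.assessment),
    ):
        lines.append(f"{heading}:")
        lines.extend(f"{label}: {value}" for label, value in fields.items())
    lines.append("PLAN:")
    lines.extend(f"- {item}" for item in note.plan)
    return "\n".join(lines)
```

Keeping extraction results in a structure and formatting them only at the end makes it straightforward to add another output format later without touching the extraction rules.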
| --- | |
| ## Project Structure | |
| ``` | |
| OpenScribe/ | |
| ├── app.py              # Main application | |
| ├── requirements.txt    # Python dependencies | |
| └── README.md           # This file | |
| ``` | |
| --- | |
| ## Technical Implementation Notes | |
| ### Speech-to-Text Module | |
| - Chunked upload (5MB) for large files | |
| - Polling with 30-second timeout | |
| - Graceful error handling for API failures | |
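
The chunking and polling behaviour listed above can be sketched without committing to a specific HTTP client. In this sketch `fetch_status` is a hypothetical callable that returns a dict with a `"status"` key; the `"completed"` and `"error"` values are assumptions about the transcription service's response, and the 1-second polling interval is an arbitrary choice.

```python
import time

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB, the chunk size noted above


def read_in_chunks(file_obj, chunk_size=CHUNK_SIZE):
    """Yield successive chunks so a large file is never held fully in memory."""
    while True:
        chunk = file_obj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def poll_until_done(fetch_status, timeout=30.0, interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call fetch_status() until it reports a final state or time runs out."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        if clock() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        sleep(interval)
```

Passing `clock` and `sleep` as parameters keeps the timeout logic testable without real waiting, and a caller can catch `TimeoutError` or `RuntimeError` to fall back to the demo transcript.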
| ### Rule-Based NLP Module | |
| - **Symptom Extraction:** 10+ keyword patterns | |
| - **Diagnosis Mapping:** Hierarchical rule matching | |
| - **Plan Generation:** Condition-specific recommendations | |
| - **Fallback Logic:** Default values for missing information | |
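
One way to read "hierarchical rule matching" is an ordered rule list where the most specific rule is tried first and a default applies when nothing matches. The sketch below assumes that reading; the rule table, the default strings, and the function names are hypothetical, and the plan text is borrowed from the example note in this README, not from clinical guidance.

```python
# Hypothetical rules, ordered from most specific to least specific.
# Each rule is (required symptom labels, diagnosis); the first match wins.
DIAGNOSIS_RULES = [
    ({"Cough", "Wheezing"}, "Acute Bronchitis with Reactive Airway Disease"),
    ({"Cough"}, "Acute Bronchitis"),
]
DEFAULT_DIAGNOSIS = "Unspecified (insufficient information)"

# Hypothetical condition-specific plan items keyed by diagnosis.
PLAN_BY_DIAGNOSIS = {
    "Acute Bronchitis with Reactive Airway Disease": [
        "Albuterol HFA 90mcg, 2 puffs q4-6h PRN for wheezing",
        "Supportive care (acute bronchitis typically viral)",
    ],
    "Acute Bronchitis": ["Supportive care (acute bronchitis typically viral)"],
}
DEFAULT_PLAN = ["Follow up in 7 days if symptoms persist"]


def map_diagnosis(symptoms):
    """Return the diagnosis of the first rule whose symptoms are all present."""
    present = set(symptoms)
    for required, diagnosis in DIAGNOSIS_RULES:
        if required <= present:  # subset test: every required label was found
            return diagnosis
    return DEFAULT_DIAGNOSIS


def build_plan(diagnosis):
    """Return condition-specific items followed by the default follow-up."""
    return PLAN_BY_DIAGNOSIS.get(diagnosis, []) + DEFAULT_PLAN
```

Rule order matters here: placing the two-symptom rule ahead of the single-symptom rule is what lets a cough with wheezing map to the more specific diagnosis.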
| ### Why Rule-Based Instead of LLM? | |
| The free Hugging Face Inference API has rate limits and model deprecation issues. The rule-based approach: | |
| - Runs without any LLM API dependency, so note generation is unaffected by rate limits or model deprecation | |
| - Demonstrates core NLP fundamentals | |
| - Shows the logic that would be fine-tuned into an LLM | |
| --- | |
| ## Acknowledgments | |
| - **AssemblyAI** for the free speech-to-text API tier | |
| - **Hugging Face** for free Spaces hosting | |
| - **Viscrow Health** for the production architecture inspiration | |
| --- | |
| *Built as an educational portfolio project demonstrating AI/ML engineering skills in healthcare automation.* |