---
license: apache-2.0
tags:
- medical
- clinical-notes
- patient-communication
- lora
- peft
- medgemma
- gguf
language:
- en
library_name: peft
---

# NoteExplain Models

Trained models for clinical note simplification: translating medical documents into patient-friendly language.

## Models

| Model | Base | Description | Overall | Accuracy | Patient-Centered |
|-------|------|-------------|---------|----------|------------------|
| **gemma-2b-distilled** | gemma-2-2b-it | Final mobile model | 70% | 73% | **76%** |
| **gemma-2b-dpo** | gemma-2-2b-it | DPO comparison | **73%** | **82%** | 61% |
| **gemma-9b-dpo** | gemma-2-9b-it | Teacher model | 79% | 91% | 70% |

## GGUF for Mobile/Local Inference

Pre-quantized GGUF models (Q4_K_M, ~1.6 GB each) for llama.cpp, Ollama, and LM Studio:

| File | Description | Download |
|------|-------------|----------|
| `gguf/gemma-2b-distilled-q4_k_m.gguf` | Distilled model (better patient communication) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf) |
| `gguf/gemma-2b-dpo-q4_k_m.gguf` | DPO model (higher accuracy) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-dpo-q4_k_m.gguf) |

### Quick Start with Ollama

```bash
# Download and run directly from the Hugging Face repo
ollama run hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf
```

### Quick Start with llama.cpp

```bash
# Download the quantized model
wget https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf

# Run interactively with a simplification prompt
./llama-cli -m gemma-2b-distilled-q4_k_m.gguf -p "Simplify this clinical note for a patient: [your note]"
```

## LoRA Adapters

For fine-tuning or full-precision inference:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and attach the distilled LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(base_model, "dejori/note-explain", subfolder="gemma-2b-distilled")

# Generate a simplified version of a clinical note
prompt = "Simplify this clinical note for a patient:\n\n[clinical note]\n\nSimplified version:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

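The snippet above uses a plain instruction prompt. Instruction-tuned Gemma 2 checkpoints are normally prompted through their chat template (via `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`); whether this model was trained with or without the template is not stated here, so treat the following as an illustrative sketch of the Gemma 2 turn format, not this repo's canonical prompt:

```python
# Sketch of the Gemma 2 chat turn format; in practice, prefer
# tokenizer.apply_chat_template, which also handles special tokens like <bos>.
def gemma_chat_prompt(user_message: str) -> str:
    """Wrap a single user turn in Gemma 2 turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = gemma_chat_prompt(
    "Simplify this clinical note for a patient:\n\n[clinical note]"
)
print(prompt)
```

If generations from the raw prompt look truncated or off-distribution, switching to the chat template is the first thing to try.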
## Training

- **DPO training**: MedGemma-27B scored 5 candidate outputs per clinical note; the scores were used to create preference pairs
- **Distillation**: the 9B DPO model generated high-quality outputs, which were used to train the 2B model via SFT

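The DPO data step above can be sketched as follows. This is illustrative only (the scores and field names are assumptions, not the project's actual pipeline): given judge-scored candidates per note, the highest-scored output becomes `chosen` and the lowest `rejected`, matching the `(prompt, chosen, rejected)` convention used by common DPO trainers such as TRL's `DPOTrainer`.

```python
# Illustrative sketch: build DPO preference pairs from judge-scored candidates.
# Scores stand in for MedGemma-27B judgments of the 5 candidates per note.
def build_preference_pairs(scored):
    """scored: list of dicts with 'prompt' and 'candidates' = [(text, score), ...]."""
    pairs = []
    for item in scored:
        ranked = sorted(item["candidates"], key=lambda c: c[1], reverse=True)
        best, worst = ranked[0], ranked[-1]
        if best[1] > worst[1]:  # skip notes where all candidates tied
            pairs.append({
                "prompt": item["prompt"],
                "chosen": best[0],
                "rejected": worst[0],
            })
    return pairs

example = [{
    "prompt": "Simplify this clinical note for a patient: ...",
    "candidates": [("clear summary", 9.0), ("jargon-heavy", 4.5), ("okay", 7.0)],
}]
print(build_preference_pairs(example))
```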
## Dataset

Training data: [dejori/note-explain](https://huggingface.co/datasets/dejori/note-explain)

## License

Apache 2.0