---
license: llama3.2
base_model: meta-llama/Llama-3.2-3B-Instruct
tags:
- medical
- clinical-reasoning
- diagnostic
- education
- fine-tuned
- lora
- sft
- trl
datasets:
- mimic-iv-ext-direct
language:
- en
pipeline_tag: text-generation
---

# Clinical Reasoning Model (Test 1)

A fine-tuned version of [Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) trained to produce step-by-step diagnostic reasoning chains from clinical patient cases.

## Purpose

This model was created for **educational purposes only**. It is designed to demonstrate how a language model can walk through the clinical reasoning process, connecting patient findings (history, physical exam, labs, imaging) to a final diagnosis in a structured, step-by-step format.

**This model is NOT intended for clinical use, patient care, or medical decision-making.**

## What It Does

Given a patient case (chief complaint, history, exam findings, labs, and imaging), the model produces:

1. A final diagnosis
2. A numbered reasoning chain that explains how each piece of clinical evidence supports or leads to that diagnosis

### Example

**Input:**
> A patient presents with productive cough, fatigue, and chest congestion. History of prior TB treatment. Chest CT shows a thin-walled cavity in the right lower lobe with adjacent calcified granulomas and bronchiectasis.

**Output:**
> FINAL DIAGNOSIS: Tuberculosis
>
> Step 1: Cavities in the lungs are common in active tuberculosis, especially when the walls of the cavities are thin, indicating the possibility of active disease or reactivation of infection.
> Supporting evidence: Superior segment right lower lobe relatively thin-walled cavity
>
> Step 2: The patient had been treated for tuberculosis several years earlier, which is important background information because tuberculosis can recur.
> Supporting evidence: TB treated years ago
>
> Step 3: In patients with a history of tuberculosis, these symptoms may indicate activity or recurrence of tuberculosis.
> Supporting evidence: symptoms of URI including fatigue, productive cough, runny nose, and chest congestion
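The example above can be reproduced with the `transformers` text-generation pipeline. A minimal sketch (the `build_messages` helper is an illustrative name introduced here; the generation call assumes `transformers` is installed and a CUDA GPU is available):

```python
# Minimal inference sketch. The prompt helper is dependency-free; the
# generation call requires `transformers` and a CUDA GPU.

def build_messages(case: str) -> list:
    """Wrap a free-text patient case as a single-turn chat prompt."""
    return [{"role": "user", "content": case}]

def generate_reasoning(case: str, max_new_tokens: int = 512) -> str:
    from transformers import pipeline  # imported lazily to keep the helper light
    generator = pipeline(
        "text-generation",
        model="ploppy2/Clinical-Reasoning-Test1",
        device="cuda",
    )
    out = generator(build_messages(case), max_new_tokens=max_new_tokens,
                    return_full_text=False)
    return out[0]["generated_text"]

case = (
    "A patient presents with productive cough, fatigue, and chest congestion. "
    "History of prior TB treatment. Chest CT shows a thin-walled cavity in the "
    "right lower lobe with adjacent calcified granulomas and bronchiectasis."
)
# print(generate_reasoning(case))  # the response should follow the FINAL DIAGNOSIS + steps format
```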

## Training Details

### Dataset

Trained on the [DiReCT (Diagnostic Reasoning for Clinical Notes)](https://physionet.org/content/mimic-iv-ext-direct/1.0.0/) dataset, which contains 511 clinical notes sourced from MIMIC-IV. Each note was annotated by physicians with structured diagnostic reasoning trees mapping clinical observations to final diagnoses.

The dataset covers 25 disease categories and 73 unique diagnoses, including:

- Acute Coronary Syndrome (NSTEMI, Unstable Angina)
- Heart Failure (HFrEF, HFpEF)
- Stroke (Hemorrhagic, Ischemic)
- Pulmonary Embolism
- Pneumonia
- COPD
- Multiple Sclerosis
- Tuberculosis
- Hypertension
- and many more

### Training Configuration

| Parameter | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Method | SFT with LoRA (PEFT) |
| Quantization | 4-bit (NF4) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 3e-5 |
| Epochs | 3 |
| Batch size | 1 (effective 8 with gradient accumulation) |
| Precision | FP16 |
| Hardware | NVIDIA T4 (Google Colab) |
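The table above roughly corresponds to the following TRL + PEFT setup. This is a sketch, not the exact training script: variable names are hypothetical, `train_dataset` is a placeholder, and DiReCT preprocessing is not shown.

```python
# Rough reconstruction of the configuration in the table above.
# Requires transformers, peft, trl, and bitsandbytes; `train_dataset`
# is a placeholder for the chat-formatted DiReCT examples.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

bnb_config = BitsAndBytesConfig(      # 4-bit NF4 quantization
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
)
lora_config = LoraConfig(             # rank / alpha / dropout from the table
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
training_args = SFTConfig(
    output_dir="clinical-reasoning-test1",
    learning_rate=3e-5,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,    # effective batch size 8
    fp16=True,
)
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,      # placeholder, not defined here
    peft_config=lora_config,
)
trainer.train()
```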

### Training Results

The model trained for 3 epochs, with training loss decreasing overall:

| Step | Training Loss |
|---|---|
| 10 | 22.38 |
| 30 | 19.23 |
| 50 | 17.03 |
| 70 | 15.23 |
| 90 | 15.08 |
| 110 | 15.07 |
| 130 | 14.57 |
| 150 | 13.90 |
| 170 | 14.35 |
| 180 | 13.71 |

## Limitations

- **Not for clinical use.** This model is an educational experiment and should never be used for actual patient care or medical decision-making.
- **Small training set.** 511 cases is a modest dataset for fine-tuning. The model may not generalize well to diseases or presentations not represented in the training data.
- **Small base model.** Llama 3.2 3B is a relatively small model. Larger models would likely produce better reasoning.
- **Biases.** The training data comes from a single institution (MIMIC-IV / Beth Israel Deaconess Medical Center), so the model may reflect that institution's patient population and clinical practices.
- **Hallucination risk.** Like all language models, this model can generate plausible-sounding but incorrect medical reasoning.

## Citation

If you use this model, please cite the DiReCT dataset:

```bibtex
@article{wang2024direct,
  title={DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models},
  author={Wang, Bowen and Chang, Jiuyang and Qian, Yiming and others},
  journal={arXiv preprint arXiv:2408.01933},
  year={2024}
}
```

```bibtex
@article{PhysioNet-mimic-iv-ext-direct-1.0.0,
  author = {Wang, Bowen and Chang, Jiuyang and Qian, Yiming},
  title = {{MIMIC-IV-Ext-DiReCT}},
  journal = {{PhysioNet}},
  year = {2025},
  doi = {10.13026/yf96-kc87}
}
```

## About

This model was created as a learning exercise in fine-tuning language models for medical education applications.