---
license: llama3.2
base_model: meta-llama/Llama-3.2-3B-Instruct
tags:
- medical
- clinical-reasoning
- diagnostic
- education
- fine-tuned
- lora
- sft
- trl
datasets:
- mimic-iv-ext-direct
language:
- en
pipeline_tag: text-generation
---

# Clinical Reasoning Model (Test 1)

A fine-tuned version of [Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) trained to produce step-by-step diagnostic reasoning chains from clinical patient cases.

## Purpose

This model was created for **educational purposes only**. It is designed to demonstrate how a language model can walk through the clinical reasoning process, connecting patient findings (history, physical exam, labs, imaging) to a final diagnosis in a structured, step-by-step format.

**This model is NOT intended for clinical use, patient care, or medical decision-making.**

## What It Does

Given a patient case (chief complaint, history, exam findings, labs, and imaging), the model produces:

1. A final diagnosis
2. A numbered reasoning chain that explains how each piece of clinical evidence supports or leads to that diagnosis

### Example

**Input:**
> A patient presents with productive cough, fatigue, and chest congestion. History of prior TB treatment. Chest CT shows a thin-walled cavity in the right lower lobe with adjacent calcified granulomas and bronchiectasis.

**Output:**
> FINAL DIAGNOSIS: Tuberculosis
>
> Step 1: Cavities in the lungs are common in active tuberculosis, especially when the walls of the cavities are thin, indicating the possibility of active disease or reactivation of infection.
> Supporting evidence: Superior segment right lower lobe relatively thin-walled cavity
>
> Step 2: The patient had been treated for tuberculosis several years earlier, which is important background information because tuberculosis can recur.
> Supporting evidence: TB treated years ago
>
> Step 3: In patients with a history of tuberculosis, these symptoms may indicate activity or recurrence of tuberculosis.
> Supporting evidence: symptoms of URI including fatigue, productive cough, runny nose, and chest congestion
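The example above can be reproduced with the `transformers` text-generation pipeline. A minimal sketch (the `build_messages` helper is an illustrative name introduced here; the generation call assumes `transformers` is installed and a CUDA GPU is available):

```python
# Minimal inference sketch. The prompt helper is dependency-free; the
# generation call requires `transformers` and a CUDA GPU.

def build_messages(case: str) -> list:
    """Wrap a free-text patient case as a single-turn chat prompt."""
    return [{"role": "user", "content": case}]

def generate_reasoning(case: str, max_new_tokens: int = 512) -> str:
    from transformers import pipeline  # imported lazily to keep the helper light
    generator = pipeline(
        "text-generation",
        model="ploppy2/Clinical-Reasoning-Test1",
        device="cuda",
    )
    out = generator(build_messages(case), max_new_tokens=max_new_tokens,
                    return_full_text=False)
    return out[0]["generated_text"]

case = (
    "A patient presents with productive cough, fatigue, and chest congestion. "
    "History of prior TB treatment. Chest CT shows a thin-walled cavity in the "
    "right lower lobe with adjacent calcified granulomas and bronchiectasis."
)
# print(generate_reasoning(case))  # the response should follow the FINAL DIAGNOSIS + steps format
```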

## Training Details

### Dataset

Trained on the [DiReCT (Diagnostic Reasoning for Clinical Notes)](https://physionet.org/content/mimic-iv-ext-direct/1.0.0/) dataset, which contains 511 clinical notes sourced from MIMIC-IV. Each note was annotated by physicians with structured diagnostic reasoning trees mapping clinical observations to final diagnoses.

The dataset covers 25 disease categories and 73 unique diagnoses, including:

- Acute Coronary Syndrome (NSTEMI, Unstable Angina)
- Heart Failure (HFrEF, HFpEF)
- Stroke (Hemorrhagic, Ischemic)
- Pulmonary Embolism
- Pneumonia
- COPD
- Multiple Sclerosis
- Tuberculosis
- Hypertension
- and many more

### Training Configuration

| Parameter | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Method | SFT with LoRA (PEFT) |
| Quantization | 4-bit (NF4) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 3e-5 |
| Epochs | 3 |
| Batch size | 1 (effective 8 with gradient accumulation) |
| Precision | FP16 |
| Hardware | NVIDIA T4 (Google Colab) |
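The table above roughly corresponds to the following TRL + PEFT setup. This is a sketch, not the exact training script: variable names are hypothetical, `train_dataset` is a placeholder, and DiReCT preprocessing is not shown.

```python
# Rough reconstruction of the configuration in the table above.
# Requires transformers, peft, trl, and bitsandbytes; `train_dataset`
# is a placeholder for the chat-formatted DiReCT examples.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

bnb_config = BitsAndBytesConfig(      # 4-bit NF4 quantization
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
)
lora_config = LoraConfig(             # rank / alpha / dropout from the table
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
training_args = SFTConfig(
    output_dir="clinical-reasoning-test1",
    learning_rate=3e-5,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,    # effective batch size 8
    fp16=True,
)
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,      # placeholder, not defined here
    peft_config=lora_config,
)
trainer.train()
```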

### Training Results

The model trained for 3 epochs, with training loss decreasing overall:

| Step | Training Loss |
|---|---|
| 10 | 22.38 |
| 30 | 19.23 |
| 50 | 17.03 |
| 70 | 15.23 |
| 90 | 15.08 |
| 110 | 15.07 |
| 130 | 14.57 |
| 150 | 13.90 |
| 170 | 14.35 |
| 180 | 13.71 |

## Limitations

- **Not for clinical use.** This model is an educational experiment and should never be used for actual patient care or medical decision-making.
- **Small training set.** 511 cases is a modest dataset for fine-tuning. The model may not generalize well to diseases or presentations not represented in the training data.
- **Small base model.** Llama 3.2 3B is a relatively small model. Larger models would likely produce better reasoning.
- **Biases.** The training data comes from a single institution (MIMIC-IV / Beth Israel Deaconess Medical Center), so the model may reflect that institution's patient population and clinical practices.
- **Hallucination risk.** Like all language models, this model can generate plausible-sounding but incorrect medical reasoning.

## Citation

If you use this model, please cite the DiReCT dataset:

```bibtex
@article{wang2024direct,
  title={DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models},
  author={Wang, Bowen and Chang, Jiuyang and Qian, Yiming and others},
  journal={arXiv preprint arXiv:2408.01933},
  year={2024}
}
```

```bibtex
@article{PhysioNet-mimic-iv-ext-direct-1.0.0,
  author = {Wang, Bowen and Chang, Jiuyang and Qian, Yiming},
  title = {{MIMIC-IV-Ext-DiReCT}},
  journal = {{PhysioNet}},
  year = {2025},
  doi = {10.13026/yf96-kc87}
}
```

## About

This model was created as a learning exercise in fine-tuning language models for medical education applications.