---
license: llama3.2
base_model: meta-llama/Llama-3.2-3B-Instruct
tags:
- medical
- clinical-reasoning
- diagnostic
- education
- fine-tuned
- lora
- sft
- trl
datasets:
- mimic-iv-ext-direct
language:
- en
pipeline_tag: text-generation
---

# Clinical Reasoning Model (Test 1)

A fine-tuned version of [Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) trained to produce step-by-step diagnostic reasoning chains from clinical patient cases.

## Purpose

This model was created for **educational purposes only**. It is designed to demonstrate how a language model can walk through the clinical reasoning process, connecting patient findings (history, physical exam, labs, imaging) to a final diagnosis in a structured, step-by-step format.

**This model is NOT intended for clinical use, patient care, or medical decision-making.**

## What It Does

Given a patient case (chief complaint, history, exam findings, labs, and imaging), the model produces:

1. A final diagnosis
2. A numbered reasoning chain that explains how each piece of clinical evidence supports or leads to that diagnosis

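The exact prompt template used during fine-tuning is not documented in this card, so the field names and instruction wording below are illustrative assumptions only. A minimal sketch of assembling a patient case into a single prompt string:

```python
# Hypothetical prompt builder for this model. Field names and the closing
# instruction are assumptions, not the template used in the actual run.

def build_case_prompt(case: dict) -> str:
    """Assemble the available case fields into one structured prompt."""
    sections = [
        ("Chief complaint", case.get("chief_complaint")),
        ("History", case.get("history")),
        ("Exam findings", case.get("exam")),
        ("Labs", case.get("labs")),
        ("Imaging", case.get("imaging")),
    ]
    # Include only the fields actually present for this case.
    lines = [f"{label}: {value}" for label, value in sections if value]
    lines.append("Give a final diagnosis, then a numbered reasoning chain "
                 "linking each finding to that diagnosis.")
    return "\n".join(lines)

print(build_case_prompt({
    "chief_complaint": "productive cough, fatigue, chest congestion",
    "history": "prior TB treatment",
    "imaging": "thin-walled cavity in the right lower lobe",
}))
```
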
### Example

**Input:**
> A patient presents with productive cough, fatigue, and chest congestion. History of prior TB treatment. Chest CT shows a thin-walled cavity in the right lower lobe with adjacent calcified granulomas and bronchiectasis.

**Output:**
> FINAL DIAGNOSIS: Tuberculosis
>
> Step 1: Cavities in the lungs are common in active tuberculosis, especially when the walls of the cavities are thin, indicating the possibility of active disease or reactivation of infection.
> Supporting evidence: Superior segment right lower lobe relatively thin-walled cavity
>
> Step 2: The patient had been treated for tuberculosis several years earlier, which is important background information because tuberculosis can recur.
> Supporting evidence: TB treated years ago
>
> Step 3: In patients with a history of tuberculosis, these symptoms may indicate activity or recurrence of tuberculosis.
> Supporting evidence: symptoms of URI including fatigue, productive cough, runny nose, and chest congestion

## Training Details

### Dataset

Trained on the [DiReCT (Diagnostic Reasoning for Clinical Notes)](https://physionet.org/content/mimic-iv-ext-direct/1.0.0/) dataset, which contains 511 clinical notes sourced from MIMIC-IV. Each note was annotated by physicians with structured diagnostic reasoning trees mapping clinical observations to final diagnoses.

The dataset covers 25 disease categories and 73 unique diagnoses, including:

- Acute Coronary Syndrome (NSTEMI, Unstable Angina)
- Heart Failure (HFrEF, HFpEF)
- Stroke (Hemorrhagic, Ischemic)
- Pulmonary Embolism
- Pneumonia
- COPD
- Multiple Sclerosis
- Tuberculosis
- Hypertension
- And many more

### Training Configuration

| Parameter | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Method | SFT with LoRA (PEFT) |
| Quantization | 4-bit (NF4) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 3e-5 |
| Epochs | 3 |
| Batch size | 1 (effective 8 with gradient accumulation) |
| Precision | FP16 |
| Hardware | NVIDIA T4 (Google Colab) |

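The tags above mention `trl`, `sft`, and `lora`, so a run with these hyperparameters could be configured roughly as follows. This is a sketch, not the original training script: the argument names come from the public `transformers`/`peft`/`trl` APIs, the values are taken from the table, and `output_dir` is a placeholder.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig

# 4-bit NF4 quantization with FP16 compute, per the table above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter settings from the table.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# SFT hyperparameters: batch size 1 with 8 accumulation steps gives the
# effective batch size of 8 noted in the table.
training_args = SFTConfig(
    output_dir="clinical-reasoning-test1",  # placeholder name
    learning_rate=3e-5,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    fp16=True,
)
```
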
### Training Results

The model trained for 3 epochs, with training loss falling from 22.38 at step 10 to 13.71 at step 180:

| Step | Training Loss |
|---|---|
| 10 | 22.38 |
| 30 | 19.23 |
| 50 | 17.03 |
| 70 | 15.23 |
| 90 | 15.08 |
| 110 | 15.07 |
| 130 | 14.57 |
| 150 | 13.90 |
| 170 | 14.35 |
| 180 | 13.71 |

## Limitations

- **Not for clinical use.** This model is an educational experiment and should never be used for actual patient care or medical decision-making.
- **Small training set.** 511 cases is a modest dataset for fine-tuning. The model may not generalize well to diseases or presentations not represented in the training data.
- **Small base model.** Llama 3.2 3B is a relatively small model. Larger models would likely produce better reasoning.
- **Biases.** The training data comes from a single institution (MIMIC-IV / Beth Israel Deaconess Medical Center), so the model may reflect that institution's patient population and clinical practices.
- **Hallucination risk.** Like all language models, this model can generate plausible-sounding but incorrect medical reasoning.

## Citation

If you use this model, please cite the DiReCT dataset:

```bibtex
@article{wang2024direct,
  title={DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models},
  author={Wang, Bowen and Chang, Jiuyang and Qian, Yiming and others},
  journal={arXiv preprint arXiv:2408.01933},
  year={2024}
}
```

```bibtex
@article{PhysioNet-mimic-iv-ext-direct-1.0.0,
  author = {Wang, Bowen and Chang, Jiuyang and Qian, Yiming},
  title = {{MIMIC-IV-Ext-DiReCT}},
  journal = {{PhysioNet}},
  year = {2025},
  doi = {10.13026/yf96-kc87}
}
```

## Contact

This model was created as a learning exercise in fine-tuning language models for medical education applications.

Created by Arman Yalcin
www.linkedin.com/in/arman8514581