# 🩺 GPT-Neo 125M Medical Reasoning LoRA
This model is a LoRA fine-tuned version of EleutherAI's GPT-Neo 125M for medical reasoning and clinical QA-style generation.
It was fine-tuned using parameter-efficient training (LoRA) on the OpenMed/Medical-Reasoning-SFT-Mega dataset.
- Only adapter weights are trained (the base model is frozen)
- Optimized for instruction-style medical reasoning
- Lightweight and efficient to run
## 📋 Model Details

- **Base Model:** EleutherAI/gpt-neo-125M
- **Architecture:** Causal Language Model
- **Fine-Tuning Method:** LoRA (PEFT)
- **Task Type:** Medical reasoning / QA generation
- **Training Objective:** Next-token prediction (causal LM)
## 🔧 Training Setup

### Dataset

- **Name:** OpenMed/Medical-Reasoning-SFT-Mega
- **Split:** 95% train / 5% validation
- **Downsampled to:** 40,000 training samples and 5,000 validation samples
- **Preprocessing:** examples were reformatted into a structured chat format
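The exact chat template is not shown in this card. As a rough sketch, reformatting a raw QA pair into an instruction-style training string could look like the following (the field names `question` and `answer` are assumptions about the dataset schema, and the `### Instruction:` / `### Response:` template is illustrative, not the verified one):

```python
def format_example(example: dict) -> str:
    """Reformat a raw QA pair into a single instruction-style training string.

    NOTE: the `question`/`answer` field names and the prompt template are
    assumptions; the actual dataset columns and template may differ.
    """
    return (
        "### Instruction:\n"
        f"{example['question']}\n\n"
        "### Response:\n"
        f"{example['answer']}"
    )

sample = {
    "question": "Common method by which bacteria can acquire new genetic material?",
    "answer": "Horizontal gene transfer, e.g. transformation, transduction, or conjugation.",
}
print(format_example(sample))
```

Each formatted string is then tokenized and truncated/packed to the 256-token block size used during training.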
### Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch Size | 8 |
| Gradient Accumulation | 2 |
| Learning Rate | 2e-4 |
| Block Size | 256 |
| Weight Decay | 0.01 |
| FP16 | Enabled (if CUDA available) |
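Since the card mentions the 🤗 Transformers Trainer API, the table above maps directly onto a `TrainingArguments` object. A minimal sketch (the output directory name is an assumption):

```python
import torch
from transformers import TrainingArguments

# Mirror of the hyperparameter table above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="gpt-neo-125m-medical-lora",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    weight_decay=0.01,
    fp16=torch.cuda.is_available(),  # mixed precision only when a GPU is present
)
```

With a per-device batch size of 8 and 2 accumulation steps, the effective batch size is 16 per device.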
### LoRA Configuration
| Parameter | Value |
|---|---|
| Rank (r) | 8 |
| Alpha | 16 |
| Dropout | 0.05 |
| Target Modules | q_proj, v_proj |
| Bias | None |
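The table above corresponds to a PEFT `LoraConfig`. A sketch of how it would be declared (assuming the standard `peft` API):

```python
from peft import LoraConfig

# Mirror of the LoRA configuration table above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # query/value projections in GPT-Neo attention
    bias="none",
    task_type="CAUSAL_LM",
)
```

Wrapping the base model with `get_peft_model(model, lora_config)` then freezes all original weights and trains only the injected low-rank adapter matrices.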
Only a small fraction of the total parameters (under 1%) was trainable, which makes training efficient.
## 📊 Evaluation

Evaluation was performed on a held-out validation set.

- **Metric:** Cross-entropy loss
- **Eval Loss:** (auto-filled during training)
- **Perplexity:** computed as `exp(eval_loss)`
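Since perplexity is the exponential of the mean cross-entropy loss, the conversion is a one-liner:

```python
import math

def perplexity(eval_loss: float) -> float:
    # Perplexity is the exponential of the mean cross-entropy loss.
    return math.exp(eval_loss)

# e.g. an eval loss of 3.0 corresponds to a perplexity of about 20.09
print(round(perplexity(3.0), 2))
```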
## 🚀 Usage

Since this repo contains only the LoRA adapter weights, they must be loaded on top of the base model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "EleutherAI/gpt-neo-125M"
adapter = "ahmedrayan/medical_lora"

# Load the tokenizer from the adapter repo and attach the LoRA weights
# to the frozen base model.
tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter)

prompt = "Common method by which bacteria can acquire new genetic material?"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## 🎯 Intended Use

This model is intended for:

- Medical reasoning research
- Educational experimentation
- Fine-tuning demonstrations
- PEFT / LoRA learning projects

⚠️ **Not intended for real clinical decision-making.**
## ⚠️ Limitations

- Small base model (125M parameters)
- Trained on a subset of the dataset (40k samples)
- May hallucinate medical facts
- No safety alignment beyond dataset supervision
- Not evaluated against clinical benchmarks
## 🧪 Hardware

- Device: CUDA (if available)
- Mixed precision (FP16)
- Trainer API from 🤗 Transformers
## 📄 License

Please refer to:

- Base model license: EleutherAI/gpt-neo-125M
- Dataset license: OpenMed/Medical-Reasoning-SFT-Mega
## 👤 Author

**Ahmed Rayan** | AI Engineer | Medical AI Enthusiast
GitHub / Hugging Face: ahmedrayan