DeepSeek-R1-Medical-COT
DeepSeek-R1-Medical-COT is a language model fine-tuned in 4-bit precision for medical reasoning and clinical scenario interpretation.
It is based on unsloth/DeepSeek-R1-Distill-Llama-8B and fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset to produce structured, step-by-step clinical reasoning and evidence-based conclusions.
Model Details
- Developed by: Mohamed Adel
- Model type: Causal Language Model (LLM)
- Language: English
- License: Apache-2.0
- Base model: unsloth/DeepSeek-R1-Distill-Llama-8B
- Finetuned for: Medical instruction-following and clinical reasoning tasks
Model Sources
- Repository: MohamedASAK/DeepSeek-R1-Medical-COT on the Hugging Face Model Hub
- Dataset for fine-tuning: FreedomIntelligence/medical-o1-reasoning-SFT
Uses
Direct Use
- Answer medical questions with step-by-step reasoning
- Predict clinical outcomes from scenarios
- Assist healthcare professionals in education or training
Downstream Use
- Integrate into medical decision-support tools
- Knowledge-grounded chatbots for clinical education
- Further fine-tuning for specialized medical domains
Out-of-Scope Use
- Real-time diagnosis for patients without supervision
- Legal or financial medical advice
- Non-medical tasks
How to Use
from unsloth import FastLanguageModel

model_name = "MohamedASAK/DeepSeek-R1-Medical-COT"

# Load the 4-bit model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(model_name, load_in_4bit=True)
FastLanguageModel.for_inference(model)  # switch Unsloth into inference mode

# Example inference
prompt = """### Clinical Scenario:
A 54-year-old man complains of frequent urinary urgency, nocturia, and a weak urinary stream. His prostate is moderately enlarged. Predict likely cystometric findings.
"""

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=500,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
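Because the model emits its chain of thought inside <think>...</think> tags (see the preprocessing notes under Training Details), the final answer can be separated from the reasoning after generation. A minimal sketch, assuming that tag convention; the sample string is illustrative, not a real model generation:

```python
def split_cot(generated: str) -> tuple[str, str]:
    """Split generated text into (reasoning, final answer) at the </think> tag."""
    if "</think>" in generated:
        reasoning, answer = generated.split("</think>", 1)
        return reasoning.replace("<think>", "").strip(), answer.strip()
    # No tag found: treat the whole output as the answer
    return "", generated.strip()

# Illustrative output only (hypothetical, not produced by the model)
sample = "<think>BPH obstructs outflow; expect detrusor changes.</think>Likely findings: ..."
reasoning, answer = split_cot(sample)
print(answer)
```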
Training Details
- Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
- Preprocessing: Prompts formatted in CoT style with <think>...</think> tags for step-by-step reasoning
- Fine-tuning method: LoRA applied to attention and feed-forward modules
- Hyperparameters:
- Batch size: 1 (gradient accumulation 8)
- Max steps: 200
- Learning rate: 2e-4
- Mixed precision: FP16 / BF16 depending on GPU support
- Optimizer: 8-bit AdamW
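The CoT preprocessing above can be sketched as a prompt template. The field names (`Question`, `Complex_CoT`, `Response`) follow the FreedomIntelligence/medical-o1-reasoning-SFT dataset; the exact template wording used for this model is not published, so this is an assumed layout:

```python
# Assumed template; the section headers and wording are illustrative
PROMPT_TEMPLATE = """### Clinical Scenario:
{question}

### Response:
<think>
{cot}
</think>
{answer}"""

def format_example(example: dict) -> str:
    # Field names follow medical-o1-reasoning-SFT (assumption)
    return PROMPT_TEMPLATE.format(
        question=example["Question"],
        cot=example["Complex_CoT"],
        answer=example["Response"],
    )

sample = {"Question": "Q", "Complex_CoT": "step 1", "Response": "A"}
print(format_example(sample))
```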
Evaluation
- Evaluated on a subset of medical reasoning questions
- Metrics: correctness of step-by-step reasoning, coherence, and final answer accuracy
- Results indicate improved structured reasoning over base model
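A minimal sketch of the final-answer accuracy metric mentioned above, assuming answers are compared as normalized strings; the scoring actually used for this card is not specified, so this is one plausible implementation:

```python
def normalize(text: str) -> str:
    # Lowercase and collapse whitespace before comparison
    return " ".join(text.lower().split())

def final_answer_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions whose normalized text matches the reference."""
    assert len(predictions) == len(references)
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)

preds = ["Detrusor overactivity", "normal study"]
refs = ["detrusor overactivity", "Detrusor underactivity"]
print(final_answer_accuracy(preds, refs))  # → 0.5
```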
Limitations and Risks
- Limited to the quality and scope of the training dataset
- May not cover rare or highly specialized medical cases
- Should not replace clinical judgment; intended for educational and reasoning support
Recommendation: Always review model outputs with a qualified healthcare professional.
Mohamed Adel (2026). DeepSeek-R1-Medical-COT. Retrieved from https://huggingface.co/MohamedASAK/DeepSeek-R1-Medical-COT
Model tree: deepseek-ai/DeepSeek-R1-Distill-Llama-8B → unsloth/DeepSeek-R1-Distill-Llama-8B → MohamedASAK/DeepSeek-R1-Medical-COT