DeepSeek-R1-Medical-COT
DeepSeek-R1-Medical-COT is a language model fine-tuned in 4-bit precision for medical reasoning and clinical scenario interpretation.
It is based on unsloth/DeepSeek-R1-Distill-Llama-8B and fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset to produce structured, step-by-step clinical reasoning and evidence-based conclusions.
Model Details
- Developed by: Mohamed Adel
- Model type: Causal Language Model (LLM)
- Language: English
- License: Apache-2.0
- Base model: unsloth/DeepSeek-R1-Distill-Llama-8B
- Finetuned for: Medical instruction-following and clinical reasoning tasks
Model Sources
- Repository: MohamedASAK/DeepSeek-R1-Medical-COT on the Hugging Face Model Hub
- Dataset for fine-tuning: FreedomIntelligence/medical-o1-reasoning-SFT
Uses
Direct Use
- Answer medical questions with step-by-step reasoning
- Predict clinical outcomes from scenarios
- Assist healthcare professionals in education or training
Downstream Use
- Integrate into medical decision-support tools
- Knowledge-grounded chatbots for clinical education
- Further fine-tuning for specialized medical domains
Out-of-Scope Use
- Real-time diagnosis for patients without supervision
- Legal or financial medical advice
- Non-medical tasks
How to Use
from unsloth import FastLanguageModel

model_name = "MohamedASAK/DeepSeek-R1-Medical-COT"

# Load the 4-bit model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(model_name, load_in_4bit=True)
FastLanguageModel.for_inference(model)  # switch Unsloth into inference mode

# Example inference
prompt = """### Clinical Scenario:
A 54-year-old man complains of frequent urinary urgency, nocturia, and a weak urinary stream. His prostate is moderately enlarged. Predict likely cystometric findings.
"""

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=500,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
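Because the model emits its chain of thought inside <think>...</think> tags (see the preprocessing notes under Training Details), the final answer can be separated from the reasoning after generation. A minimal sketch, assuming that tag convention; the sample string is illustrative, not a real model generation:

```python
def split_cot(generated: str) -> tuple[str, str]:
    """Split generated text into (reasoning, final answer) at the </think> tag."""
    if "</think>" in generated:
        reasoning, answer = generated.split("</think>", 1)
        return reasoning.replace("<think>", "").strip(), answer.strip()
    # No tag found: treat the whole output as the answer
    return "", generated.strip()

# Illustrative output only (hypothetical, not produced by the model)
sample = "<think>BPH obstructs outflow; expect detrusor changes.</think>Likely findings: ..."
reasoning, answer = split_cot(sample)
print(answer)
```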
Training Details
- Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
- Preprocessing: Prompts formatted in CoT style with <think>...</think> tags for step-by-step reasoning
- Fine-tuning method: LoRA applied to attention and feed-forward modules
- Hyperparameters:
- Batch size: 1 (gradient accumulation 8)
- Max steps: 200
- Learning rate: 2e-4
- Mixed precision: FP16 / BF16 depending on GPU support
- Optimizer: 8-bit AdamW
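The CoT preprocessing above can be sketched as a prompt template. The field names (`Question`, `Complex_CoT`, `Response`) follow the FreedomIntelligence/medical-o1-reasoning-SFT dataset; the exact template wording used for this model is not published, so this is an assumed layout:

```python
# Assumed template; the section headers and wording are illustrative
PROMPT_TEMPLATE = """### Clinical Scenario:
{question}

### Response:
<think>
{cot}
</think>
{answer}"""

def format_example(example: dict) -> str:
    # Field names follow medical-o1-reasoning-SFT (assumption)
    return PROMPT_TEMPLATE.format(
        question=example["Question"],
        cot=example["Complex_CoT"],
        answer=example["Response"],
    )

sample = {"Question": "Q", "Complex_CoT": "step 1", "Response": "A"}
print(format_example(sample))
```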
Evaluation
- Evaluated on a subset of medical reasoning questions
- Metrics: correctness of step-by-step reasoning, coherence, and final answer accuracy
- Results indicate improved structured reasoning over base model
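A minimal sketch of the final-answer accuracy metric mentioned above, assuming answers are compared as normalized strings; the scoring actually used for this card is not specified, so this is one plausible implementation:

```python
def normalize(text: str) -> str:
    # Lowercase and collapse whitespace before comparison
    return " ".join(text.lower().split())

def final_answer_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions whose normalized text matches the reference."""
    assert len(predictions) == len(references)
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)

preds = ["Detrusor overactivity", "normal study"]
refs = ["detrusor overactivity", "Detrusor underactivity"]
print(final_answer_accuracy(preds, refs))  # → 0.5
```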
Limitations and Risks
- Limited to the quality and scope of the training dataset
- May not cover rare or highly specialized medical cases
- Should not replace clinical judgment; intended for educational and reasoning support
Recommendation: Always review model outputs with a qualified healthcare professional.
Mohamed Adel (2026). DeepSeek-R1-Medical-COT. Retrieved from https://huggingface.co/MohamedASAK/DeepSeek-R1-Medical-COT
Model tree: deepseek-ai/DeepSeek-R1-Distill-Llama-8B → unsloth/DeepSeek-R1-Distill-Llama-8B → MohamedASAK/DeepSeek-R1-Medical-COT