LlamaTron RS1 ThinkDoc

Model Description

LlamaTron RS1 ThinkDoc is a specialized medical AI assistant developed through supervised fine-tuning of Meta's Llama 3.2 1B Instruct model on 112,165 real doctor-patient conversations. This model is the first iteration in a progressive fine-tuning series (Research Series 1), designed to provide medically informed responses in clinical consultation contexts.

The model has been fine-tuned using Low-Rank Adaptation (LoRA) on the ChatDoctor-HealthCareMagic-100k dataset, enabling it to generate human-aligned responses that mirror the consultation patterns of practicing physicians while maintaining professional and empathetic communication.

Intended Use

Primary Use Cases

  • Research and educational purposes in medical AI
  • Demonstration of fine-tuning techniques for domain-specific applications
  • Study of language model adaptation to medical conversation patterns
  • Development and testing of medical chatbot interfaces

Out-of-Scope Use

This model is not intended for:

  • Providing actual medical diagnosis or treatment
  • Replacing professional medical consultation
  • Emergency medical situations
  • Making clinical decisions without physician oversight

Important: All medical advice generated by this model should be considered informational only and must not substitute consultation with qualified healthcare providers.

Model Details

Base Model Information

  • Architecture: Llama 3.2 1B Instruct
  • Model ID: meta-llama/Llama-3.2-1B-Instruct
  • Total Parameters: 1.24 billion
  • Developer: Meta AI

Fine-Tuning Specifications

  • Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Trainable Parameters: 11.2M (0.90% of total parameters)
  • Dropout Rate: 0.05
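As a sanity check on the reported trainable-parameter count, the adapter size implied by these settings can be computed directly: each LoRA-adapted weight of shape (d_out, d_in) adds r × (d_in + d_out) trainable parameters. The sketch below uses the published Llama 3.2 1B dimensions (hidden size 2048, intermediate size 8192, 16 layers, 8 KV heads of head dim 64); these come from the base model's config, not from this card.

```python
# LoRA trainable-parameter estimate for Llama 3.2 1B with r=16 on all
# seven targeted projection modules. Dimensions are taken from the base
# model's published config and are assumptions here, not from this card.
r = 16
hidden, intermediate, n_layers = 2048, 8192, 16
kv_dim = 8 * 64  # num_key_value_heads * head_dim (grouped-query attention)

# (in_features, out_features) for each targeted module
modules = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# Each adapter pair (A: r x in, B: out x r) adds r * (in + out) parameters.
per_layer = sum(r * (d_in + d_out) for d_in, d_out in modules.values())
total = per_layer * n_layers
print(f"{total:,} trainable LoRA parameters")  # 11,272,192
```

The result, roughly 11.27M parameters, matches the 11.2M (0.90% of 1.24B) figure reported above.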

Training Dataset

  • Name: ChatDoctor-HealthCareMagic-100k
  • Source: lavita/ChatDoctor-HealthCareMagic-100k
  • Total Samples: 112,165 medical conversations
  • Training Split: 106,548 samples (95%)
  • Evaluation Split: 5,608 samples (5%)
  • Format: Instruction-input-output format containing real doctor-patient interactions
  • Domain Coverage: General medical consultation across various medical specialties
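The instruction-input-output records map naturally onto the chat-message format the model expects at inference time. A minimal conversion sketch is shown below; the field names (`instruction`, `input`, `output`) follow the common ChatDoctor layout and should be verified against the actual dataset before use.

```python
# Convert one ChatDoctor-style record into chat messages for supervised
# fine-tuning. Field names (instruction/input/output) are assumptions
# based on the common ChatDoctor layout; verify against the dataset.
def to_messages(record: dict) -> list[dict]:
    return [
        {"role": "system", "content": record["instruction"]},
        {"role": "user", "content": record["input"]},
        {"role": "assistant", "content": record["output"]},
    ]

example = {
    "instruction": "If you are a doctor, please answer the medical "
                   "questions based on the patient's description.",
    "input": "I have had a dry cough for two weeks.",
    "output": "A persistent dry cough can have several causes...",
}
messages = to_messages(example)
print([m["role"] for m in messages])  # ['system', 'user', 'assistant']
```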

Training Configuration

Hardware:

  • GPU: NVIDIA H200
  • Training Duration: Approximately 3 hours

Hyperparameters:

  • Epochs: 2
  • Per-Device Batch Size: 4
  • Gradient Accumulation Steps: 4
  • Effective Batch Size: 16
  • Learning Rate: 3e-4
  • Learning Rate Schedule: Cosine with 3% warmup
  • Optimizer: Paged AdamW 8-bit
  • Max Sequence Length: 1024 tokens
  • Precision: BFloat16
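The cosine schedule with 3% warmup can be sketched as a plain function of the step index. This is a simplified version of what the Hugging Face Trainer's `cosine` scheduler with `warmup_ratio=0.03` computes, ignoring minor implementation details such as off-by-one step handling; the total step count shown is an estimate derived from the figures above, not a logged value.

```python
import math

def lr_at(step: int, total_steps: int, base_lr: float = 3e-4,
          warmup_ratio: float = 0.03) -> float:
    """Linear warmup for the first 3% of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Rough step count: 106,548 samples * 2 epochs / effective batch 16
total = 13_300
print(lr_at(0, total))                  # 0.0 (start of warmup)
print(lr_at(int(total * 0.03), total))  # 3e-4 (peak, end of warmup)
print(lr_at(total, total))              # 0.0 (fully decayed)
```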

Technical Stack:

  • Framework: Hugging Face Transformers
  • Fine-tuning Library: PEFT (Parameter-Efficient Fine-Tuning)
  • Training Library: Hugging Face Trainer
  • Model Format: SafeTensors

Model Files

This repository contains the complete merged model ready for inference. The following files are included:

  • model.safetensors - Model weights in SafeTensors format
  • config.json - Model configuration
  • generation_config.json - Generation parameters configuration
  • tokenizer.json - Tokenizer vocabulary and settings
  • tokenizer_config.json - Tokenizer configuration
  • chat_template.jinja - Chat template for formatting conversations

All files have been uploaded and are available for direct download or automatic loading via the Transformers library.

Usage

Installation

pip install transformers torch accelerate peft

Loading the Model

All model files have been uploaded to Hugging Face Hub. You can load the model directly without any manual downloads:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load directly from Hugging Face Hub
model_id = "sufirumii/LlamaTron-RS1-ThinkDoc"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

The model will be automatically downloaded and cached on your system. For offline usage, you can also manually download all files from this repository and load from a local path:

# Load from local directory (if you've manually downloaded the files)
local_path = "./LlamaTron-RS1-ThinkDoc"

tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModelForCausalLM.from_pretrained(
    local_path,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

Basic Inference

messages = [
    {
        "role": "system", 
        "content": "If you are a doctor, please answer the medical questions based on the patient's description."
    },
    {
        "role": "user", 
        "content": "I have a severe headache and fever for 3 days. What should I do?"
    }
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages, 
    add_generation_prompt=True, 
    tokenize=False
)

# Tokenize input
input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

# Generate response
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=400,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode and print response
response = tokenizer.decode(
    output[0][input_ids.shape[-1]:], 
    skip_special_tokens=True
)
print(response)

Advanced Inference Parameters

For different use cases, you can adjust generation parameters:

# More deterministic output
output = model.generate(
    input_ids,
    max_new_tokens=400,
    temperature=0.3,
    top_p=0.85,
    do_sample=True,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id
)

# More creative output
output = model.generate(
    input_ids,
    max_new_tokens=500,
    temperature=0.9,
    top_p=0.95,
    top_k=50,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

Model Capabilities

The model demonstrates the following capabilities:

  • Medical Question Answering: Generates responses to patient queries across various medical domains
  • Symptom Analysis: Provides structured assessment of reported symptoms
  • Treatment Recommendations: Offers general medical advice and potential treatment options
  • Professional Communication: Maintains an empathetic and professional tone consistent with physician-patient interactions
  • Contextual Understanding: Processes patient descriptions and provides relevant medical information

Limitations and Biases

Technical Limitations

  • Training Scope: Limited to patterns observed in the ChatDoctor-HealthCareMagic-100k dataset
  • Sequence Length: Maximum context window of 1024 tokens may limit handling of very long medical histories
  • Modality: Text-only model without support for medical imaging or multimodal inputs
  • Hallucination Risk: May generate plausible-sounding but medically incorrect information
  • No Clinical Validation: Has not been evaluated against established clinical benchmarks or real-world deployment scenarios

Domain Limitations

  • Training data primarily consists of general medical consultations
  • May have limited knowledge of rare conditions or cutting-edge medical research
  • Cannot perform physical examinations or order diagnostic tests
  • Lacks integration with electronic health records or patient-specific data

Ethical Considerations

  • Not a Medical Device: This model is not certified as a medical device and should not be used for clinical decision-making
  • Liability: Users assume all responsibility for any decisions made based on model outputs
  • Data Privacy: Users must ensure patient data privacy when using this model
  • Bias: Training data may contain inherent biases present in historical doctor-patient interactions

Evaluation

Current evaluation metrics are limited to training and validation loss during fine-tuning. The model has not undergone:

  • Clinical accuracy assessment
  • Medical board examination benchmarks
  • Expert physician review
  • Real-world deployment testing
  • Safety and harm evaluation specific to medical contexts

Future iterations will incorporate comprehensive evaluation protocols.

Future Development

RS2 Roadmap

The next iteration (Research Series 2) will feature:

  • Integration of 800K+ chain-of-thought reasoning samples
  • Extended context length support
  • Enhanced multi-turn conversation capabilities
  • Improved reasoning transparency

Long-term Vision

  • Scaling to multi-million sample datasets
  • Comprehensive clinical benchmark evaluation
  • Multilingual medical consultation support
  • Specialized adaptations for medical sub-domains (cardiology, oncology, pediatrics, etc.)
  • Integration of medical knowledge graphs and structured clinical data

Citation

If you use this model in your research or applications, please cite:

@misc{llamatron-rs1-thinkdoc-2026,
  title={LlamaTron RS1 ThinkDoc: Fine-Tuned Medical Consultation Assistant},
  author={sufirumii},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/sufirumii/LlamaTron-RS1-ThinkDoc}}
}

Acknowledgments

  • Meta AI for developing and releasing the Llama 3.2 base model
  • Lavita for curating and providing the ChatDoctor-HealthCareMagic-100k dataset
  • Hugging Face for the transformers and PEFT libraries that enabled efficient fine-tuning

License

This model is released under the Llama 3.2 Community License; users must comply with Meta's license terms. Refer to the official Llama 3.2 Community License for complete usage terms and restrictions.

The fine-tuned weights are provided as-is for research and educational purposes.

Model Card Contact

For questions, issues, or collaboration inquiries:


Version: RS1 (Research Series 1)
Status: Released
Last Updated: February 2026
