OncoCareBrain-GPT

Model Description

OncoCareBrain-GPT is a large language model specialized for oncology applications. Built on the Qwen2.5-3B foundation model, it underwent supervised fine-tuning (SFT) on tens of thousands of multi-omics samples spanning genomic, pathological, and clinical data. The model is designed for the cancer care domain and is trained to make its reasoning explicit.

Key Features

  • Intelligent Medical Q&A: Answers complex questions about cancer, drawing on oncology-specific fine-tuning
  • Precision Decision Support: Suggests candidate treatment plans based on multi-dimensional data analysis
  • Transparent Reasoning Process: Generates detailed chains of thought to support explainability and trust in clinical settings

Intended Uses

  • Clinical Decision Support: Assists healthcare providers in evaluating treatment options
  • Patient Education: Helps patients better understand their condition and treatment plans
  • Medical Research: Supports researchers in analyzing cancer data and generating insights

Training Data

OncoCareBrain-GPT was fine-tuned on a diverse dataset comprising:

  • Genomic data
  • Pathological samples
  • Clinical records and case studies

The model was trained to generate detailed reasoning chains, provide personalized prognostic assessments, and suggest evidence-based treatment recommendations.
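
For illustration, a single training record in such an SFT setup might look like the sketch below; the field names and values are hypothetical, since the actual data schema has not been published:

# Hypothetical SFT record (illustrative only; the field names and values
# are assumptions, not the published training schema)
example_record = {
    "instruction": "Assess prognosis and suggest evidence-based treatment options.",
    "input": {
        "genomics": "BRCA1 pathogenic variant; TP53 wild-type",
        "pathology": "Invasive ductal carcinoma, grade 2, ER+/PR+/HER2-",
        "clinical": "Stage IIA, no prior systemic therapy",
    },
    "output": "<reasoning chain> ... <prognostic assessment> ... <treatment recommendation>",
}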

Technical Specifications

  • Base Model: Qwen2.5-3B
  • Parameters: 3 billion
  • Training Method: Supervised Fine-Tuning (SFT)
  • Language Capabilities: English, Chinese
  • Input Format: Natural language
  • Output Format: Detailed explanations with chain-of-thought reasoning
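
For reference, a 3-billion-parameter model occupies roughly 6 GB of weight memory in bfloat16 (versus about 12 GB in float32), so loading in half precision is usually preferable. A minimal sketch, assuming the accelerate package is installed for device_map="auto":

import torch
from transformers import AutoModelForCausalLM

# bfloat16 halves weight memory relative to float32 for the 3B parameters.
model = AutoModelForCausalLM.from_pretrained(
    "DXCLab/OncoCareBrain-GPT",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)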

Limitations

  • The model should be used as a clinical decision support tool and not as a replacement for professional medical judgment
  • Recommendations should be verified by qualified healthcare professionals
  • Performance may vary depending on the complexity and rarity of cancer cases
  • While the model supports English and Chinese, performance might vary between languages

Ethical Considerations

  • Privacy: The model operates on input data and does not store patient information
  • Bias: While efforts have been made to minimize bias, users should be aware that biases in the training data may surface in model outputs
  • Transparency: The model provides reasoning chains to ensure transparency in its decision-making process

How to Use

# Example code for model inference
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DXCLab/OncoCareBrain-GPT")
model = AutoModelForCausalLM.from_pretrained("DXCLab/OncoCareBrain-GPT")

input_text = "Could you analyze this genomic profile and suggest potential treatment options for breast cancer with BRCA1 mutation?"
inputs = tokenizer(input_text, return_tensors="pt")
# max_new_tokens bounds only the generated continuation; max_length would
# also count the prompt tokens toward the limit.
outputs = model.generate(**inputs, max_new_tokens=1000)
# Skip special tokens so the decoded text is clean natural language.
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
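
If the fine-tuned checkpoint inherits Qwen2.5's chat template (an assumption worth verifying against the released tokenizer_config.json), formatting the prompt as a chat turn may produce better-structured responses. A minimal sketch:

# Optional: chat-style prompting, assuming the tokenizer ships a chat template.
messages = [{"role": "user", "content": input_text}]
chat_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
chat_outputs = model.generate(chat_inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(chat_outputs[0][chat_inputs.shape[-1]:], skip_special_tokens=True))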

Citation

If you use OncoCareBrain-GPT in your research, please cite:

@misc{OncoCareBrain-GPT,
  author = {DXCLab},
  title = {OncoCareBrain-GPT: A Specialized Language Model for Oncology},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/DXCLab/OncoCareBrain-GPT}}
}

License

This model is licensed under the Apache License 2.0. See the LICENSE file for details.

Contact

For questions or feedback about OncoCareBrain-GPT, please visit our Hugging Face page at https://huggingface.co/DXCLab or open an issue in the repository.
