--- license: apache-2.0 language: - en - zh base_model: - Qwen/Qwen2.5-3B-Instruct tags: - medical - cancer - Onco --- # OncoCareBrain-GPT ## Model Description OncoCareBrain-GPT is a specialized large language model fine-tuned for oncology applications. Built upon the powerful Qwen2.5-3B foundation model, it has undergone supervised fine-tuning (SFT) with tens of thousands of multi-omics data samples, including genomic, pathological, and clinical data. This model is specifically designed to serve the cancer care domain with advanced reasoning capabilities. ## Key Features - **Intelligent Medical Q&A**: Quickly answers complex questions about cancer, leveraging a deep understanding of oncology concepts - **Precision Decision Support**: Recommends optimal treatment plans based on multi-dimensional data analysis - **Transparent Reasoning Process**: Generates detailed chains of thought to ensure model explainability and trust in clinical settings ## Intended Uses - **Clinical Decision Support**: Assists healthcare providers in evaluating treatment options - **Patient Education**: Helps patients better understand their condition and treatment plans - **Medical Research**: Supports researchers in analyzing cancer data and generating insights ## Training Data OncoCareBrain-GPT was fine-tuned on a diverse dataset comprising: - Genomic data - Pathological samples - Clinical records and case studies The model was trained to generate detailed reasoning chains, provide personalized prognostic assessments, and suggest evidence-based treatment recommendations. ## Technical Specifications - **Base Model**: Qwen2.5-3B - **Parameters**: 3 billion - **Training Method**: Supervised Fine-Tuning (SFT) - **Language Capabilities**: English, Chinese - **Input Format**: Natural language - **Output Format**: Detailed explanations with chain-of-thought reasoning ## Limitations - The model should be used as a clinical decision support tool and not as a replacement for professional medical judgment - Recommendations should be verified by qualified healthcare professionals - Performance may vary depending on the complexity and rarity of cancer cases - While the model supports English and Chinese, performance might vary between languages ## Ethical Considerations - **Privacy**: The model operates on input data and does not store patient information - **Bias**: While efforts have been made to minimize biases, users should be aware of potential biases in training data - **Transparency**: The model provides reasoning chains to ensure transparency in its decision-making process ## How to Use ```python # Example code for model inference from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("DXCLab/OncoCareBrain-GPT") model = AutoModelForCausalLM.from_pretrained("DXCLab/OncoCareBrain-GPT") input_text = "Could you analyze this genomic profile and suggest potential treatment options for breast cancer with BRCA1 mutation?" inputs = tokenizer(input_text, return_tensors="pt") outputs = model.generate(**inputs, max_length=1000) response = tokenizer.decode(outputs[0]) print(response) ``` ## Citation If you use OncoCareBrain-GPT in your research, please cite: ``` @misc{OncoCareBrain-GPT, author = {DXCLab}, title = {OncoCareBrain-GPT: A Specialized Language Model for Oncology}, year = {2025}, publisher = {Hugging Face}, howpublished = {\url{https://huggingface.co/DXCLab/OncoCareBrain-GPT}} } ``` ## License This model is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details. ## Contact For questions or feedback about OncoCareBrain-GPT, please visit our Hugging Face page at https://huggingface.co/DXCLab or open an issue in the repository.