OncoScope Cancer Genomics Analysis Model
OncoScope is a specialized AI model fine-tuned for cancer genomics analysis and precision oncology. Built on Google's Gemma 3n architecture, this model provides expert-level analysis of cancer mutations, risk assessments, and therapeutic recommendations while maintaining complete privacy through on-device inference.
Model Details
- Base Model: Google Gemma 3n E4B (instruction-tuned chat variant)
- Parameters: 6.9B (quantized from fine-tuned model)
- Architecture: Gemma3n
- Quantization: Q8_0 GGUF format
- Context Length: 32,768 tokens
- Embedding Length: 2,048
Key Features
- Cancer Mutation Analysis: Pathogenicity assessment using ACMG/AMP guidelines
- Risk Stratification: Hereditary cancer syndrome evaluation
- Therapeutic Recommendations: Evidence-based drug target identification
- Privacy-First: Designed for on-device inference with Ollama
- Clinical Guidelines: Incorporates established medical standards
- Multi-mutation Analysis: Complex genomic interaction assessment
Training Data
The model was fine-tuned on a curated dataset of 5,998 cancer genomics examples from:
- ClinVar: Clinical variant database
- COSMIC Top 50: Cancer mutation signatures
- Expert-curated: Clinical oncology cases
Usage
With Ollama
Download the model files:
- oncoscope-gemma-3n-merged.Q8_0.gguf (6.8GB)
- Modelfile
Create the model:
ollama create oncoscope -f Modelfile
Run inference:
ollama run oncoscope "Analyze the clinical significance of BRCA1 c.5266dupC mutation"
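The same prompt can also be sent programmatically. The following is a minimal sketch, assuming Ollama is running locally on its default port (11434) and that the model was created under the name `oncoscope` as above. The helper names `build_request` and `generate` are illustrative and not part of OncoScope or Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "oncoscope") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "oncoscope", timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose "response"
    # field holds the full completion.
    return body["response"]
```

Calling `generate("Analyze the clinical significance of BRCA1 c.5266dupC mutation")` would return the model's text. The generous timeout is a guess based on the 20-40 second response times reported for complex analyses.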
Example Usage
ollama run oncoscope "Patient: 45-year-old female with family history of breast cancer.
Mutation: BRCA1 c.68_69delAG (p.Glu23ValfsTer17).
Please provide pathogenicity assessment and recommendations."
Example Response:
{
  "mutation_analysis": {
    "gene": "BRCA1",
    "variant": "c.68_69delAG",
    "protein_change": "p.Glu23ValfsTer17",
    "pathogenicity": "Pathogenic",
    "confidence_score": 0.95,
    "acmg_classification": "PVS1, PM2, PP3"
  },
  "clinical_significance": {
    "cancer_risk": "High",
    "associated_cancers": ["Breast", "Ovarian"],
    "lifetime_risk": {
      "breast_cancer": "55-85%",
      "ovarian_cancer": "15-40%"
    }
  },
  "recommendations": {
    "genetic_counseling": "Strongly recommended",
    "screening": "Enhanced surveillance starting age 25",
    "prevention": "Consider prophylactic surgery",
    "family_testing": "Cascade testing recommended"
  }
}
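If the output follows the shape of the example response, it can be checked before use. This is a rough sketch, assuming the model returns a JSON object with the three top-level sections shown above; the model card does not guarantee that format, and `parse_report` and `summarize` are hypothetical helper names.

```python
import json

# Top-level sections taken from the example response; an assumption about
# the output format, not a documented schema.
REQUIRED_SECTIONS = ("mutation_analysis", "clinical_significance", "recommendations")


def parse_report(raw: str) -> dict:
    """Parse model output and check that the expected sections are present."""
    # The model may wrap the JSON in extra text, so keep only the span
    # between the first "{" and the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    report = json.loads(raw[start : end + 1])
    missing = [key for key in REQUIRED_SECTIONS if key not in report]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return report


def summarize(report: dict) -> str:
    """One-line summary of the mutation analysis section."""
    analysis = report["mutation_analysis"]
    return (
        f'{analysis["gene"]} {analysis["variant"]}: {analysis["pathogenicity"]} '
        f'(confidence {analysis["confidence_score"]:.2f})'
    )
```

Rejecting malformed or incomplete output early is a deliberate choice here: downstream code should not have to guess whether a field such as `pathogenicity` is present.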
Model Capabilities
- Pathogenicity Assessment: ACMG/AMP guideline compliance
- Risk Calculation: Quantitative cancer risk estimates
- Drug Recommendations: FDA-approved targeted therapies
- Family History Analysis: Hereditary pattern recognition
- Genetic Counseling: Evidence-based guidance
- Multi-lingual Support: Medical terminology in multiple languages
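To make the ACMG/AMP reference concrete, the sketch below shows how pathogenic evidence codes such as `PVS1, PM2, PP3` combine into a classification. It reflects my reading of the 2015 ACMG/AMP combining rules, covers only the pathogenic-evidence side (benign criteria and conflicting evidence are ignored), and is not a description of how OncoScope works internally. The function name `classify` is illustrative.

```python
import re
from collections import Counter

# Evidence strength is encoded in the code prefix:
# PVS = very strong, PS = strong, PM = moderate, PP = supporting.
_CODE = re.compile(r"^(PVS|PS|PM|PP)\d+$")


def classify(criteria: list[str]) -> str:
    """Combine pathogenic ACMG/AMP evidence codes into a classification.

    Simplification: benign criteria (BA/BS/BP) are not handled, so anything
    short of 'Likely pathogenic' is reported as 'Uncertain significance'.
    """
    counts = Counter()
    for code in criteria:
        match = _CODE.match(code.strip().upper())
        if match is None:
            raise ValueError(f"unsupported criterion: {code!r}")
        counts[match.group(1)] += 1
    pvs, ps, pm, pp = (counts[k] for k in ("PVS", "PS", "PM", "PP"))

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    return "Likely pathogenic" if likely_pathogenic else "Uncertain significance"
```

For the example response, `classify("PVS1, PM2, PP3".split(", "))` yields `"Pathogenic"`, because one very strong criterion plus one moderate and one supporting criterion satisfies the pathogenic rule.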
Limitations
- Medical Disclaimer: This model is for research and educational purposes only. Always consult qualified healthcare professionals for medical decisions.
- Training Cutoff: Knowledge based on training data through early 2024
- Quantization: Some precision loss due to Q8_0 quantization
- Context Window: Supports up to 32,768 tokens, but prompts of 4,096 tokens or fewer are recommended for optimal performance
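One way to respect the 4,096-token recommendation is a rough pre-check on prompt length before sending a request. The sketch below assumes about four characters per token, which is a crude heuristic and not the Gemma tokenizer, and reserves an arbitrary 1,024 tokens for the answer. `num_ctx` is the Ollama request option that sets the context window; the other names are illustrative.

```python
import math

RECOMMENDED_CONTEXT = 4096  # token budget from the limitation above
CHARS_PER_TOKEN = 4         # assumption: crude average, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_budget(prompt: str, reserved_for_answer: int = 1024) -> bool:
    """True if the estimated prompt size leaves room for the answer."""
    return estimate_tokens(prompt) + reserved_for_answer <= RECOMMENDED_CONTEXT


def ollama_options() -> dict:
    """Options to include in an Ollama API request to cap the context window."""
    return {"num_ctx": RECOMMENDED_CONTEXT}
```

An exact count would require the model's own tokenizer, so this check is best treated as an early warning for oversized inputs such as long variant lists.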
Technical Specifications
- Model Size: 6.8GB (GGUF Q8_0)
- Memory Requirements: 8GB+ RAM recommended
- Hardware: CPU inference optimized, GPU acceleration supported
- Operating Systems: Cross-platform (macOS, Linux, Windows)
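The file size can be sanity-checked from the parameter count. In GGUF's Q8_0 format, weights are stored in blocks of 32 int8 values plus one 2-byte fp16 scale, which is 34 bytes per 32 weights, or 8.5 bits per weight. The sketch below assumes that every tensor is stored as Q8_0 and that the reported "6.8GB" is measured in GiB; both are simplifications.

```python
Q8_0_BLOCK_WEIGHTS = 32  # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 34    # 32 int8 values + one 2-byte fp16 scale


def q8_0_size_bytes(n_params: float) -> float:
    """Estimated size in bytes if all parameters are stored as Q8_0."""
    return n_params / Q8_0_BLOCK_WEIGHTS * Q8_0_BLOCK_BYTES


size = q8_0_size_bytes(6.9e9)  # 6.9B parameters, from Model Details
print(f"{size / 1e9:.2f} GB, {size / 2**30:.2f} GiB")
# prints "7.33 GB, 6.83 GiB"
```

The estimate of about 6.83 GiB is close to the reported model size. Runtime memory will be somewhat higher once the context cache and runtime overhead are added, which is consistent with the 8GB+ RAM recommendation.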
Performance
The model demonstrates expert-level performance, with the following reported results:
- Variant pathogenicity classification (>90% accuracy vs. clinical consensus)
- Cancer risk assessment correlation with established guidelines
- Therapeutic recommendation alignment with FDA approvals
- Response time: 20-40 seconds for complex genomic analysis
Privacy & Security
- On-Device Inference: No data transmitted to external servers
- HIPAA Compliance: Suitable for clinical environments
- Offline Operation: Full functionality without internet connection
- Data Security: Patient genetic information remains local
Citation
If you use this model in your research, please cite:
@misc{oncoscope2025,
title={OncoScope: Privacy-First Cancer Genomics Analysis with Gemma 3n},
author={Sheldon Aristide},
year={2025},
url={https://huggingface.co/Zero21/OncoScope}
}
License
This model is released under the Apache 2.0 license. The base Gemma model is distributed under Google's Gemma Terms of Use, which also govern derived models.
Support & Contact
For questions, issues, or contributions:
- GitHub: OncoScope Project
- Issues: Please report bugs or feature requests via GitHub Issues
Disclaimer
This AI model is intended for research and educational purposes only. It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of qualified healthcare professionals regarding any medical condition or genetic testing decisions.
Model tree for Zero21/OncoScope: base model google/gemma-3n-E4B