ONCA 1.5
Open pancreatic cancer language model for four research-oriented workflows: trial screening, clinical reasoning, pathology extraction, and variant evidence interpretation.
ONCA 1.5 is the current continued-SFT release in the ONCA model line. It builds on Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled and follows the project direction established in ONCA 1.0: focus on pancreatic cancer workflows, open-data training assets, and practical structured prompting for clinical research use.
ONCA 1.5 training aimed to improve all four existing task families at once, with particular attention to parser-safe trial screening, preserved pathology extraction performance, concise clinical reasoning, and stronger variant evidence handling.
At a Glance
| Field | Value |
|---|---|
| Release | BF16 reference release |
| Base model | Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled |
| Architecture | Qwen3.5-class causal LM (Qwen3_5ForCausalLM) |
| Context window | 262,144 tokens |
| Training recipe | Continued SFT |
| Domain focus | Pancreatic cancer and oncology-adjacent research workflows |
What This Release Is Good At
- Criterion-aware pancreatic cancer trial screening with explicit eligibility framing.
- Concise oncology reasoning and question answering for research workflows.
- Structured pathology abstraction with field-oriented prompting.
- Variant evidence interpretation with oncology context and uncertainty signaling.
What It Is Specialized For
This model is specialized for pancreatic cancer and oncology-adjacent research workflows rather than broad general-purpose medical chat. It works best when the task is tightly scoped and the target output format is explicit, especially for:
- pancreatic cancer trial eligibility review
- pathology report abstraction into structured fields
- concise oncology reasoning for case discussion
- variant evidence interpretation with uncertainty signaling
Example: pancreatic cancer trial-screening workflow (assumes tokenizer and model are already loaded as shown in Quick Start)
prompt = """
Task: Pancreatic cancer trial screening.
Patient summary:
- 63-year-old with metastatic pancreatic ductal adenocarcinoma
- ECOG 1
- Prior gemcitabine plus nab-paclitaxel
- Bilirubin 0.9 mg/dL
- No active infection
Trial criteria:
- Histologically confirmed metastatic pancreatic adenocarcinoma
- ECOG 0-1
- Progression after 1 prior systemic regimen
- Adequate marrow and hepatic function
- Exclude uncontrolled infection
Return:
1. Eligibility label
2. Criterion-by-criterion reasoning
3. Missing information
"""
messages = [
    {"role": "system", "content": "You are Onca, a pancreatic cancer clinical research assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding; temperature is ignored unless sampling is enabled
)
# Keep only the newly generated tokens (drop the prompt)
answer = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(answer, skip_special_tokens=True))
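
The prompt layout used in this example can also be assembled programmatically when screening many patient and trial pairs. The following is a minimal sketch, not part of the ONCA release: the helper names `build_screening_prompt` and `build_screening_messages` are hypothetical, and the section labels simply mirror the example prompt above.

```python
SYSTEM_PROMPT = "You are Onca, a pancreatic cancer clinical research assistant."


def build_screening_prompt(patient_summary, trial_criteria):
    """Format patient facts and trial criteria in the layout of the example prompt.

    Hypothetical helper: both arguments are lists of short strings.
    """
    lines = ["Task: Pancreatic cancer trial screening.", "Patient summary:"]
    lines += [f"- {item}" for item in patient_summary]
    lines.append("Trial criteria:")
    lines += [f"- {item}" for item in trial_criteria]
    lines += [
        "Return:",
        "1. Eligibility label",
        "2. Criterion-by-criterion reasoning",
        "3. Missing information",
    ]
    return "\n".join(lines)


def build_screening_messages(patient_summary, trial_criteria):
    """Wrap the prompt in the system/user message list used with the chat template."""
    prompt = build_screening_prompt(patient_summary, trial_criteria)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages`.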
Release Note
This is the main reference checkpoint in the ONCA 1.5 family and the best starting point if you want the least altered release.
Quick Start
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "Joesh1/onca-1.5-9B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
Use the included tokenizer and chat template for best results:
messages = [
    {"role": "system", "content": "You are Onca, a pancreatic cancer clinical research assistant."},
    {"role": "user", "content": "Extract tumor grade, margin status, pT, and pN from this pathology report as JSON."},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding; temperature is ignored unless sampling is enabled
)
# Keep only the newly generated tokens (drop the prompt)
answer = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(answer, skip_special_tokens=True))
Prompting Tips
- Use the included chat template and ask for a specific output structure.
- For extraction tasks, request exact JSON keys or field names.
- For screening tasks, include both the patient summary and the trial criteria.
- Ask the model to state uncertainty and missing information explicitly.
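
For extraction tasks, the tips above can be combined into a small request-and-parse helper. This is a sketch under stated assumptions rather than the project's own tooling: the function names and the example key names (`tumor_grade`, `margin_status`, `pT`, `pN`) are hypothetical, and the parser simply takes the outermost `{...}` span of the reply, which may not suit every output style.

```python
import json


def build_extraction_request(report_text, keys):
    """Build a user message that names the exact JSON keys to return."""
    key_list = ", ".join(f'"{k}"' for k in keys)
    return (
        "Extract the following fields from this pathology report as JSON. "
        f"Use exactly these keys: {key_list}. "
        "Use null for any field the report does not state.\n\n"
        f"Pathology report:\n{report_text}"
    )


def parse_json_answer(answer, keys):
    """Return a dict restricted to `keys`, or None if no valid JSON object is found.

    Keys missing from the model's reply are filled with None so that
    downstream code always sees the same fields.
    """
    start = answer.find("{")
    end = answer.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(answer[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    return {k: data.get(k) for k in keys}
```

A `None` result is a signal to route the report for manual review rather than to retry silently.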
Training Scope
The active ONCA 1.5 continued-SFT stack uses openly available data only. The data mixture keeps all four task families in play, with weighting geared toward variant evidence repair and trial-screening stability while guarding against pathology regression.
| Task family | Active rows | Train | Val | Test |
|---|---|---|---|---|
| Trial Screening | 12,137 | 10,921 | 608 | 608 |
| Clinical Reasoning | 3,496 | 3,146 | 174 | 176 |
| Pathology Extraction | 7,642 | 6,583 | 414 | 405 |
| Variant Evidence | 2,432 | 2,191 | 116 | 125 |
| Total | 25,707 | 22,841 | 1,312 | 1,314 |
Initial task weights in the active prepare stack are 27% trial screening, 18% clinical reasoning, 27% pathology extraction, and 28% variant evidence.
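
To see what these weights imply relative to the raw row counts, the sketch below compares each task weight with that task's share of training rows. It assumes the weights act as target sampling probabilities per task family; how the prepare stack actually applies them is not described here, and the function name `sampling_multipliers` is hypothetical.

```python
# Train-split row counts and initial task weights, copied from the table and text above.
TRAIN_ROWS = {
    "trial_screening": 10921,
    "clinical_reasoning": 3146,
    "pathology_extraction": 6583,
    "variant_evidence": 2191,
}
TASK_WEIGHTS = {
    "trial_screening": 0.27,
    "clinical_reasoning": 0.18,
    "pathology_extraction": 0.27,
    "variant_evidence": 0.28,
}


def sampling_multipliers(rows, weights):
    """Ratio of each task's target weight to its natural share of rows.

    A value above 1 means the task is upsampled relative to uniform
    sampling over rows; a value below 1 means it is downsampled.
    """
    total = sum(rows.values())
    return {task: weights[task] / (rows[task] / total) for task in rows}


if __name__ == "__main__":
    for task, mult in sampling_multipliers(TRAIN_ROWS, TASK_WEIGHTS).items():
        print(f"{task}: x{mult:.2f}")
```

Under this assumption, variant evidence examples would be drawn roughly 2.9 times as often as their share of rows suggests, while trial screening would be drawn at a little over half its natural rate.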
Benchmarks
Benchmark tables and comparative evaluation plots will be added later.
Repository Contents
- model-*.safetensors: sharded weights for this release.
- model.safetensors.index.json: shard map for the checkpoint files.
- config.json: architecture config and, for quantized variants, quantization metadata.
- generation_config.json: default generation settings.
- tokenizer.json and tokenizer_config.json: tokenizer assets.
- chat_template.jinja: chat formatting template for inference.
- assets/onca-logo-horizontal.svg: ONCA family logo used at the top of the model card.
Related Releases
- onca-1.5: BF16 reference release (this page).
- onca-1.5-8bit: 8-bit merged release.
- onca-1.5-4bit: 4-bit merged release.
Limitations and Safety
- This is a research model and not a clinical decision system.
- Outputs should be reviewed by qualified experts before any real-world use.
- The model is specialized for pancreatic cancer and oncology-adjacent workflows rather than broad general medicine.
- Variant evidence training includes broader oncology signal, but the intended framing of the model remains pancreatic cancer research support.
- Quantized releases are convenience variants and may behave slightly differently from the BF16 reference checkpoint.
Citation
A formal ONCA 1.5 citation block will be added with the accompanying manuscript. Until then, please cite the model repository and version used in your work.
Acknowledgments
ONCA 1.5 continues the ONCA project lineage from ONCA 1.0 and builds on the Qwen/Qwopus ecosystem plus the open-data contributors whose datasets made this release possible.
Model tree for Joesh1/onca-1.5-9B
Base model
Qwen/Qwen3.5-9B-Base