Spec-o3-3B
Model Summary
Spec-o3-3B is a tool-augmented vision-language agent for astronomer-aligned spectral inspection and candidate vetting. It is designed to follow an interactive workflow that resembles expert review: inspect an initial full-spectrum view, iteratively request localized wavelength-window re-visualizations via a lightweight spectral visualization tool, and update hypotheses through interleaved multimodal reasoning before producing a final decision.
- Backbone: Qwen2.5-VL-3B-Instruct
- Inputs: Text + spectrum visualizations (global view plus tool-rendered zoomed views)
- Core capability: Multi-turn inspection with tool calls for evidence localization
Intended Use
- Primary: Human-in-the-loop assistance for spectral inspection and rare-object candidate vetting, providing (1) a final judgment and (2) an auditable inspection trace grounded in localized spectral evidence.
- Suitable settings: Research, offline analysis, and expert triage pipelines.
- Not intended for: Fully automated catalog publication without expert verification, safety-critical decision-making, or uses outside spectroscopic analysis.
How It Works (Tool-Augmented Inspection)
Spec-o3 alternates between:
- Reasoning about what spectral evidence is needed, and
- Tool calls that request re-rendered views for specific wavelength intervals.
A typical tool call uses JSON arguments like:
{"label": "Zoom on Hα region", "wavelength_range": [6500, 6600]}
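On the tool side, arguments like these can be parsed and sanity-checked before rendering. The sketch below is a minimal illustration, not the released tool: the function name `parse_zoom_call` and the default coverage `full_range` are hypothetical, and wavelengths are assumed to be in ångströms (the example window brackets Hα at about 6563 Å).

```python
import json


def parse_zoom_call(raw: str, full_range=(3700.0, 9100.0)):
    """Parse tool-call JSON and clamp the window to the spectrum's coverage.

    `full_range` is a hypothetical default; use your survey's actual coverage.
    """
    args = json.loads(raw)
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    rng = args.get("wavelength_range")
    if not (isinstance(rng, list) and len(rng) == 2):
        raise ValueError("wavelength_range must be a two-element list")
    lo, hi = float(rng[0]), float(rng[1])
    if lo >= hi:
        raise ValueError("wavelength_range must be increasing")
    # Clamp the request to the wavelengths that actually exist.
    lo = max(lo, full_range[0])
    hi = min(hi, full_range[1])
    if lo >= hi:
        raise ValueError("requested window lies outside the spectrum")
    return {"label": str(args.get("label", "")), "wavelength_range": (lo, hi)}


call = parse_zoom_call('{"label": "Zoom on Hα region", "wavelength_range": [6500, 6600]}')
print(call["label"], call["wavelength_range"])
```

Rejecting malformed windows up front keeps rendering errors out of the model's context and makes the inspection trace easier to audit.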
Recommended Prompt / Output Format
To reproduce the intended behavior, structure each run as follows:
- Give a clear task instruction (e.g., “inspect the spectrum and decide whether the candidate should be accepted”).
- Allow tool calls during inference (one per turn).
- Expect a final decision with a short justification.
If you prefer not to expose verbose reasoning traces in production, you can post-process outputs to retain only the final answer and a brief evidence summary.
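One way to do that post-processing is sketched below. This card does not specify the exact output markup, so the sketch assumes, purely for illustration, that the final decision is wrapped in `<answer>...</answer>` tags; the tag name and the helper `extract_final_answer` are hypothetical and should be replaced by whatever delimiter your prompt template enforces.

```python
import re

# Hypothetical delimiter: adjust to the markup your prompt template uses.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_final_answer(output: str):
    """Return the text of the last answer block, or None if there is none."""
    matches = ANSWER_RE.findall(output)
    return matches[-1].strip() if matches else None


raw = "<think>Hα looks broad; zoom needed.</think><answer> accept </answer>"
print(extract_final_answer(raw))
```

Taking the last match guards against intermediate turns that echo the delimiter before the final decision.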
Training Overview
Spec-o3-3B uses a two-stage post-training recipe:
- Cold-start SFT: supervised fine-tuning on ~1k expert-approved spectral inspection trajectories with tool usage.
- Outcome-based RL (GRPO): reinforcement learning on label-only inspection tasks to improve decision quality, stabilize tool usage, and strengthen evidence localization.
High-level notes:
- Tool-rendered outputs are loss-masked (excluded from the training loss) to discourage memorization of images.
- RL uses group-wise rollouts (e.g., 8 rollouts) and an outcome reward emphasizing correctness and format compliance.
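The two notes above can be illustrated with a small sketch. It is not the training code: the reward weighting (`format_weight`), the use of the population standard deviation, and all function names are assumptions. It shows a masked mean over per-token losses and the group-relative advantage commonly used in GRPO, where each rollout's reward is standardized against the other rollouts for the same prompt.

```python
from statistics import mean, pstdev


def masked_mean_loss(token_losses, mask):
    """Average per-token losses over positions with mask == 1 only.

    Tool-rendered tokens would carry mask == 0 and contribute nothing.
    """
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(l * m for l, m in zip(token_losses, mask)) / kept


def outcome_reward(correct: bool, well_formed: bool, format_weight: float = 0.1) -> float:
    """Hypothetical outcome reward: correctness plus a small format bonus."""
    return float(correct) + format_weight * float(well_formed)


def group_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of rollouts for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A toy group of 8 rollouts as (correct, well_formed) pairs.
outcomes = [(True, True), (True, True), (False, True), (False, False),
            (True, True), (False, True), (True, False), (False, True)]
rewards = [outcome_reward(c, f) for c, f in outcomes]
print(group_advantages(rewards))
```

Because advantages are relative to the group, a prompt on which every rollout receives the same reward yields zero advantage for all of them.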
Evaluation (Reported)
- SpecVI-Bench (macro-average F1 across inspection tasks): 73.3
- Cross-Survey (transfer to SDSS/DESI matched spectra, average F1): SDSS 77.3, DESI 73.6 (reference on LAMOST subset: 79.8)
- Cross-Task (transfer to unseen inspection categories on LAMOST, average F1): 74.4
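For readers reproducing these metrics, the sketch below shows one common way to compute a macro-average F1 across tasks: compute F1 per task from true positives, false positives, and false negatives, then take the unweighted mean. The exact averaging protocol behind the reported numbers is not stated in this card, so treat this as an assumption; the task names and counts are toy values, not results. The reported scores appear to be on a 0 to 100 scale, whereas this sketch returns values in [0, 1].

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from confusion counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_task_counts: dict) -> float:
    """Unweighted mean of per-task F1 scores."""
    scores = [f1_score(*counts) for counts in per_task_counts.values()]
    return sum(scores) / len(scores)


# Toy (tp, fp, fn) counts for two hypothetical tasks.
toy_counts = {"task_a": (8, 2, 2), "task_b": (1, 1, 1)}
print(macro_f1(toy_counts))
```

Macro-averaging weights every task equally, so a rare category counts as much as a common one.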
Limitations
- The released checkpoints are evaluated on a limited set of inspection tasks and do not cover all astrophysical classes or all observational conditions encountered in production pipelines.
- Real-world vetting often requires external cross-matching and additional modalities (photometry, imaging, time-domain evidence) beyond spectrum-only inspection.
- Extending to new surveys or new target categories may still require expert demonstration data for cold start and careful validation.
- The model does not yet provide production-grade uncertainty handling (e.g., abstention, calibration, or risk-aware triage) out of the box.
Usage (Conceptual)
A typical integration loop:
- Render an initial full-range spectrum image.
- Run the model with the task prompt + image.
- If a tool call is emitted, render the requested wavelength window and feed the new image back.
- Repeat until a final decision is produced or a tool-call budget is reached.
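The loop above can be sketched as follows. This is a conceptual outline under stated assumptions, not the project's inference code: `model_step` and `render` are hypothetical callables you supply, and the reply is assumed to be a dict holding either a `tool_call` or a `decision`. The stubs at the bottom only stand in for a real model and renderer.

```python
def run_inspection(model_step, render, prompt, max_tool_calls=6):
    """Drive the inspect / zoom / decide loop with a tool-call budget."""
    images = [render(None)]  # initial full-range view (None means "no window")
    trace = []               # tool calls made so far, kept for auditing
    while True:
        reply = model_step(prompt, images)
        call = reply.get("tool_call")
        if call is None:
            # No tool call means the model produced its final decision.
            return {"decision": reply["decision"], "trace": trace,
                    "budget_exhausted": False}
        if len(trace) >= max_tool_calls:
            # Budget reached: stop without a decision and let a human review.
            return {"decision": None, "trace": trace, "budget_exhausted": True}
        images.append(render(tuple(call["wavelength_range"])))
        trace.append(call)


# Stubs for illustration only.
def fake_render(window):
    return f"image:{window}"


def fake_model(prompt, images):
    if len(images) == 1:
        return {"tool_call": {"label": "Zoom on Hα region",
                              "wavelength_range": [6500, 6600]}}
    return {"decision": "accept"}


result = run_inspection(fake_model, fake_render, "Inspect the spectrum.")
print(result["decision"], len(result["trace"]))
```

Returning the trace alongside the decision preserves the auditable record of which wavelength windows were inspected.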
Citation
@misc{Jia2026SpecO3,
  author        = {Minghui Jia and Qichao Zhang and Ali Luo and Linjing Li and Shuo Ye and Hailing Lu and Wen Hou and Dongbin Zhao},
  title         = {Spec-o3: A Tool-Augmented Vision-Language Agent for Rare Celestial Object Candidate Vetting via Automated Spectral Inspection},
  eprint        = {2601.06498},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  year          = {2026},
  url           = {https://arxiv.org/abs/2601.06498},
  doi           = {10.48550/arXiv.2601.06498}
}