WebOS SDFT 2B
Full-parameter SDFT (Self-Distillation Fine-Tuning, a paper-faithful implementation of Algorithm 1 in Shenfeld et al., arXiv:2601.19897) of Qwen3.5-2B on the WebOS GUI-grounding dataset.
Layout
| Subfolder | Description | Click-acc on full test set |
|---|---|---|
| `checkpoint-200/` | Step 200 / 616 (32% of training) | 18.41% |
| `checkpoint-400/` | Step 400 / 616 (65%) | 18.75% |
| `checkpoint-600/` | Step 600 / 616 (97%) | 19.95% |
| `final/` | Step 616 / 616 (100%) | ~20% |
For comparison, base Qwen3.5-2B scores 18.84% on the same eval, and Qwen3.5-2B with a privileged ground-truth-demo (GT-demo) teacher prompt scores 26.46%.
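To make the comparison easier to read, the snippet below computes each checkpoint's click-accuracy difference from the base model in percentage points. It uses only the exact figures reported above (the approximate `final/` value is left out), and the names `BASE_ACC`, `reported`, and `delta_vs_base` are illustrative, not part of the repository.

```python
# Click accuracy (%) as reported in this card; "final/" is omitted because
# only an approximate value (~20%) is given for it.
BASE_ACC = 18.84
reported = {
    "checkpoint-200": 18.41,
    "checkpoint-400": 18.75,
    "checkpoint-600": 19.95,
    "base + privileged GT-demo teacher prompt": 26.46,
}


def delta_vs_base(acc, base=BASE_ACC):
    """Difference from the base model in percentage points, rounded to 2 decimals."""
    return round(acc - base, 2)


for name, acc in reported.items():
    print(f"{name}: {delta_vs_base(acc):+.2f} pts vs. base")
```

On these numbers, `checkpoint-200/` and `checkpoint-400/` sit slightly below the base model, and `checkpoint-600/` is the first listed checkpoint above it.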
Training config (16b_train_sdft_full_ft.py)
- bf16, full-param FT (vision frozen)
- bs=1, ga=32 (effective batch 32)
- lr=5e-6, warmup=10 steps, cosine
- AdamW, wd=0, max_grad_norm=1.0
- ema-α=0.01, kl-temperature=2.0, reverse KL
- 2 epochs × 9,864 train samples / effective batch 32 ≈ 616 optimizer steps
- on-policy max_new_tokens=96, temperature=1.0
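The settings above can be illustrated with a small, dependency-free sketch. It is not the code from `16b_train_sdft_full_ft.py`. It assumes that the step count drops each epoch's final partial accumulation window, that the KL temperature divides both student and teacher logits before the softmax, and that `ema-α` blends the teacher weights toward the student as `(1 - α) * teacher + α * student`. All function names here are hypothetical.

```python
import math


def optimizer_steps(num_samples, epochs, batch_size, grad_accum):
    """Optimizer steps, assuming each epoch drops its last partial accumulation window."""
    per_epoch = num_samples // (batch_size * grad_accum)
    return per_epoch * epochs


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def reverse_kl(student_logits, teacher_logits, temperature=2.0):
    """Reverse KL, i.e. KL(student || teacher), for one token's vocabulary distribution."""
    log_s = log_softmax(student_logits, temperature)
    log_t = log_softmax(teacher_logits, temperature)
    # Expectation is taken under the student distribution (hence "reverse").
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(log_s, log_t))


def ema_update(teacher_params, student_params, alpha=0.01):
    """Assumed EMA rule: teacher moves a fraction alpha toward the student."""
    return [(1.0 - alpha) * t + alpha * s
            for t, s in zip(teacher_params, student_params)]


if __name__ == "__main__":
    print(optimizer_steps(9864, epochs=2, batch_size=1, grad_accum=32))
    print(reverse_kl([2.0, 0.5, -1.0], [1.0, 1.0, 0.0]))
    print(ema_update([1.0, 0.0], [0.0, 1.0]))
```

With bs=1 and ga=32, each epoch covers 9,864 // 32 = 308 full accumulation windows, which gives the 616 steps listed above under the stated rounding assumption.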
Loading
```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# Load one checkpoint subfolder (here step 600) in bf16 on the first GPU.
model = AutoModelForImageTextToText.from_pretrained(
    "Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
    torch_dtype="bfloat16", device_map="cuda:0",
)
# The processor is loaded from the same subfolder as the weights.
processor = AutoProcessor.from_pretrained(
    "Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
)
```
Code
Training and eval scripts: https://github.com/ChenghengLi/WebOS