WebOS SDFT 2B

Full-parameter SDFT (Self-Distillation Fine-Tuning; a paper-faithful implementation of Algorithm 1 from Shenfeld et al., arXiv 2601.19897) of Qwen3.5-2B on the WebOS GUI-grounding dataset.

Layout

Subfolder        Description                        Click-acc (full test set)
checkpoint-200/  Step 200 / 616 (32% of training)   18.41%
checkpoint-400/  Step 400 / 616 (65%)               18.75%
checkpoint-600/  Step 600 / 616 (97%)               19.95%
final/           Step 616 / 616 (100%)              ~20%

For comparison: the base Qwen3.5-2B scores 18.84% on the same eval, and Qwen3.5-2B with a privileged GT-demo teacher prompt scores 26.46%.
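Click-acc here presumably means the fraction of test samples where the predicted click lands inside the ground-truth element's bounding box. A minimal sketch of that metric, assuming the model emits a coordinate pair like "(x, y)" in its text output (the exact output format and metric definition live in the repo's eval scripts, not here):

```python
import re

def parse_click(text):
    # Extract the first "(x, y)" integer pair from the model's output text.
    # The output format is an assumption; the WebOS eval scripts are authoritative.
    m = re.search(r"\((\d+),\s*(\d+)\)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None

def click_acc(outputs, bboxes):
    # Fraction of predicted clicks that fall inside the ground-truth
    # element bbox (x0, y0, x1, y1) -- one common definition of click-acc.
    hits = 0
    for text, (x0, y0, x1, y1) in zip(outputs, bboxes):
        pt = parse_click(text)
        if pt and x0 <= pt[0] <= x1 and y0 <= pt[1] <= y1:
            hits += 1
    return hits / len(outputs)

print(click_acc(["click (120, 45)", "(900, 30)"],
                [(100, 40, 140, 60), (0, 0, 50, 50)]))  # prints 0.5
```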

Training config (16b_train_sdft_full_ft.py)

  • bf16, full-param FT (vision frozen)
  • batch size 1, gradient accumulation 32 (effective batch 32)
  • lr=5e-6, warmup=10 steps, cosine
  • AdamW, wd=0, max_grad_norm=1.0
  • ema-α=0.01, kl-temperature=2.0, reverse KL
  • 2 epochs × 9,864 train samples = 616 optimizer steps
  • on-policy max_new_tokens=96, temperature=1.0
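A sketch of the two SDFT-specific pieces of a training step under these hyperparameters: the reverse-KL objective at temperature 2.0 against the EMA teacher's logits, and the EMA teacher update with α=0.01. This is an illustrative numpy framing (reverse KL taken as KL(student ‖ teacher), the common convention); the function names are hypothetical and 16b_train_sdft_full_ft.py is authoritative:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over the last (vocab) axis.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def reverse_kl(student_logits, teacher_logits, T=2.0):
    # KL(student || teacher) at temperature T, averaged over token positions.
    ps = softmax(student_logits, T)
    pt = softmax(teacher_logits, T)
    return float((ps * (np.log(ps) - np.log(pt))).sum(axis=-1).mean())

def ema_update(teacher_params, student_params, alpha=0.01):
    # After each optimizer step: teacher <- (1 - alpha) * teacher + alpha * student.
    return {k: (1 - alpha) * teacher_params[k] + alpha * student_params[k]
            for k in teacher_params}
```

Per optimizer step, the student samples an on-policy completion (max_new_tokens=96, temperature 1.0), the loss is the reverse KL between student and EMA-teacher logits on those tokens, and the teacher is EMA-blended toward the student after the update.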

Loading

from transformers import AutoModelForImageTextToText, AutoProcessor

# Load the step-600 checkpoint (best click-acc); swap subfolder for the others.
model = AutoModelForImageTextToText.from_pretrained(
    "Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
    torch_dtype="bfloat16", device_map="cuda:0",
)
processor = AutoProcessor.from_pretrained(
    "Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
)

Code

Training and eval scripts: https://github.com/ChenghengLi/WebOS
