WebOS SDFT 2B

Full-parameter SDFT (Self-Distillation Fine-Tuning; a paper-faithful implementation of Algorithm 1 from Shenfeld et al., arXiv 2601.19897) of Qwen3.5-2B on the WebOS GUI-grounding dataset.

Layout

Subfolder        Description                        Click-acc (full test set)
checkpoint-200/  Step 200 / 616 (32% of training)   18.41%
checkpoint-400/  Step 400 / 616 (65%)               18.75%
checkpoint-600/  Step 600 / 616 (97%)               19.95%
final/           Step 616 / 616 (100%)              ~20%

For comparison: the base Qwen3.5-2B scores 18.84% on the same eval, and Qwen3.5-2B with a privileged GT-demo teacher prompt scores 26.46%.
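Click-acc here presumably means the fraction of test samples where the predicted click lands inside the ground-truth element's bounding box. A minimal sketch of that metric, assuming the model emits a coordinate pair like "(x, y)" in its text output (the exact output format and metric definition live in the repo's eval scripts, not here):

```python
import re

def parse_click(text):
    # Extract the first "(x, y)" integer pair from the model's output text.
    # The output format is an assumption; the WebOS eval scripts are authoritative.
    m = re.search(r"\((\d+),\s*(\d+)\)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None

def click_acc(outputs, bboxes):
    # Fraction of predicted clicks that fall inside the ground-truth
    # element bbox (x0, y0, x1, y1) -- one common definition of click-acc.
    hits = 0
    for text, (x0, y0, x1, y1) in zip(outputs, bboxes):
        pt = parse_click(text)
        if pt and x0 <= pt[0] <= x1 and y0 <= pt[1] <= y1:
            hits += 1
    return hits / len(outputs)

print(click_acc(["click (120, 45)", "(900, 30)"],
                [(100, 40, 140, 60), (0, 0, 50, 50)]))  # prints 0.5
```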

Training config (16b_train_sdft_full_ft.py)

  • bf16, full-param FT (vision frozen)
  • batch size 1, gradient accumulation 32 (effective batch 32)
  • lr=5e-6, warmup=10 steps, cosine
  • AdamW, wd=0, max_grad_norm=1.0
  • ema-α=0.01, kl-temperature=2.0, reverse KL
  • 2 epochs × 9,864 train samples = 616 optimizer steps
  • on-policy max_new_tokens=96, temperature=1.0
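A sketch of the two SDFT-specific pieces of a training step under these hyperparameters: the reverse-KL objective at temperature 2.0 against the EMA teacher's logits, and the EMA teacher update with α=0.01. This is an illustrative numpy framing (reverse KL taken as KL(student ‖ teacher), the common convention); the function names are hypothetical and 16b_train_sdft_full_ft.py is authoritative:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over the last (vocab) axis.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def reverse_kl(student_logits, teacher_logits, T=2.0):
    # KL(student || teacher) at temperature T, averaged over token positions.
    ps = softmax(student_logits, T)
    pt = softmax(teacher_logits, T)
    return float((ps * (np.log(ps) - np.log(pt))).sum(axis=-1).mean())

def ema_update(teacher_params, student_params, alpha=0.01):
    # After each optimizer step: teacher <- (1 - alpha) * teacher + alpha * student.
    return {k: (1 - alpha) * teacher_params[k] + alpha * student_params[k]
            for k in teacher_params}
```

Per optimizer step, the student samples an on-policy completion (max_new_tokens=96, temperature 1.0), the loss is the reverse KL between student and EMA-teacher logits on those tokens, and the teacher is EMA-blended toward the student after the update.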

Loading

from transformers import AutoModelForImageTextToText, AutoProcessor

# Load the step-600 checkpoint (best click-acc); swap subfolder for the others.
model = AutoModelForImageTextToText.from_pretrained(
    "Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
    torch_dtype="bfloat16", device_map="cuda:0",
)
processor = AutoProcessor.from_pretrained(
    "Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
)

Code

Training and eval scripts: https://github.com/ChenghengLi/WebOS
