---
license: apache-2.0
base_model: Qwen/Qwen3.5-2B
tags:
- gui-grounding
- sdft
- self-distillation
- qwen3.5
---

# WebOS SDFT 2B

Full-parameter SDFT (Self-Distillation Fine-Tuning, paper-faithful Shenfeld et al. 2601.19897 Algorithm 1) of Qwen3.5-2B on the WebOS GUI-grounding dataset.

## Layout

| Subfolder | Description | Click-acc on full test set |
|---|---|---|
| `checkpoint-200/` | Step 200 / 616 (32% of training) | 18.41% |
| `checkpoint-400/` | Step 400 / 616 (65%) | 18.75% |
| `checkpoint-600/` | Step 600 / 616 (97%) | **19.95%** |
| `final/` | Step 616 / 616 (100%) | ~20% |

For comparison: base Qwen3.5-2B on the same eval is 18.84%, and Qwen3.5-2B + privileged GT-demo teacher prompt is 26.46%.

## Training config (16b_train_sdft_full_ft.py)

- bf16, full-param FT (vision frozen)
- bs=1, ga=32 (effective batch 32)
- lr=5e-6, warmup=10 steps, cosine
- AdamW, wd=0, max_grad_norm=1.0
- ema-α=0.01, kl-temperature=2.0, reverse KL
- 2 epochs × 9,864 train samples = 616 optimizer steps
- on-policy max_new_tokens=96, temperature=1.0

## Loading

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

model = AutoModelForImageTextToText.from_pretrained(
    "Chengheng/webos-sdft-2b",
    subfolder="checkpoint-600",
    torch_dtype="bfloat16",
    device_map="cuda:0",
)
processor = AutoProcessor.from_pretrained(
    "Chengheng/webos-sdft-2b",
    subfolder="checkpoint-600",
)
```

## Code

Training and eval scripts: https://github.com/ChenghengLi/WebOS
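
## Config sketch

The numbers in the training config can be sanity-checked with a small standalone sketch. It is not taken from `16b_train_sdft_full_ft.py`; the function names are hypothetical, and it assumes three things the card does not spell out: that a trailing partial gradient-accumulation window is dropped (floor division), that "reverse KL" means KL(student ‖ teacher) with both distributions softened by the KL temperature, and that the EMA teacher is updated as `teacher = (1 - α) * teacher + α * student`.

```python
import math


def optimizer_steps(num_samples, epochs, batch_size, grad_accum):
    """Optimizer steps, assuming a trailing partial accumulation window is dropped."""
    return (num_samples * epochs) // (batch_size * grad_accum)


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_logits, teacher_logits, temperature=2.0):
    """KL(student || teacher) for one token position (assumed meaning of 'reverse KL')."""
    p_student = softmax(student_logits, temperature)
    p_teacher = softmax(teacher_logits, temperature)
    return sum(
        s * (math.log(s) - math.log(t))
        for s, t in zip(p_student, p_teacher)
        if s > 0.0
    )


def ema_update(teacher_params, student_params, alpha=0.01):
    """Assumed EMA convention: the teacher moves a fraction alpha toward the student."""
    return [(1.0 - alpha) * t + alpha * s for t, s in zip(teacher_params, student_params)]


if __name__ == "__main__":
    # 2 epochs x 9,864 samples with bs=1 and ga=32
    print(optimizer_steps(9864, 2, 1, 32))
    # Toy 3-token vocabulary; real logits come from the student and teacher models
    print(reverse_kl([2.0, 0.5, -1.0], [1.0, 1.0, 0.0]))
    print(ema_update([1.0, 0.0], [0.0, 1.0]))
```

With α=0.01 the teacher moves only 1% of the way toward the student per update, so under this convention it tracks a slowly changing average of recent student weights rather than the latest step.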