---
license: apache-2.0
base_model: Qwen/Qwen3.5-2B
tags:
- gui-grounding
- sdft
- self-distillation
- qwen3.5
---
# WebOS SDFT 2B
Full-parameter SDFT (Self-Distillation Fine-Tuning) of Qwen3.5-2B on the WebOS GUI-grounding dataset, following Algorithm 1 of Shenfeld et al. (arXiv:2601.19897).
## Layout
| Subfolder | Description | Click accuracy (full test set) |
|---|---|---|
| `checkpoint-200/` | Step 200 / 616 (32% of training) | 18.41% |
| `checkpoint-400/` | Step 400 / 616 (65%) | 18.75% |
| `checkpoint-600/` | Step 600 / 616 (97%) | **19.95%** |
| `final/` | Step 616 / 616 (100%) | ~20% |
For comparison, base Qwen3.5-2B scores 18.84% on the same evaluation, and Qwen3.5-2B with a privileged ground-truth-demo teacher prompt scores 26.46%.
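The table reports click accuracy without defining it. A common convention in GUI-grounding evaluation is to count a prediction as correct when the predicted click point lands inside the target element's bounding box. The sketch below assumes that convention; the function name, the `(x, y)` point format, and the `(left, top, right, bottom)` box format are illustrative and are not taken from the WebOS evaluation scripts.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]               # (x, y)
Box = Tuple[float, float, float, float]   # (left, top, right, bottom)


def click_accuracy(points: Sequence[Point], boxes: Sequence[Box]) -> float:
    """Fraction of predicted click points that fall inside their target box.

    Assumed metric definition (point-in-box); the actual eval may differ.
    """
    if len(points) != len(boxes):
        raise ValueError("points and boxes must have the same length")
    if not points:
        return 0.0
    hits = 0
    for (x, y), (left, top, right, bottom) in zip(points, boxes):
        # Box edges are treated as inclusive.
        if left <= x <= right and top <= y <= bottom:
            hits += 1
    return hits / len(points)


# Hypothetical predictions and targets, for illustration only.
preds = [(10.0, 10.0), (50.0, 50.0), (5.0, 90.0), (70.0, 20.0)]
targets = [(0, 0, 20, 20), (0, 0, 20, 20), (0, 80, 10, 100), (60, 10, 80, 30)]
print(f"click accuracy: {click_accuracy(preds, targets):.2%}")
```

Under this definition, the gap between the step-600 checkpoint and the base model (19.95% vs. 18.84%) corresponds to roughly one additional correct click per 90 test samples.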
## Training config (`16b_train_sdft_full_ft.py`)
- bf16, full-param FT (vision frozen)
- bs=1, ga=32 (effective batch 32)
- lr=5e-6, warmup=10 steps, cosine
- AdamW, wd=0, max_grad_norm=1.0
- ema-α=0.01, kl-temperature=2.0, reverse KL
- 2 epochs × 9,864 train samples ÷ effective batch 32 ≈ 616 optimizer steps
- on-policy max_new_tokens=96, temperature=1.0
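To make the hyperparameters above concrete, the following sketch reproduces the step count and illustrates the learning-rate schedule, the temperature-scaled reverse KL, and the EMA teacher update. It is written from the listed config values only, not from `16b_train_sdft_full_ft.py`. The function names are hypothetical, the schedule is assumed to be linear warmup followed by cosine decay to zero, the reverse KL is taken as KL(student || teacher) over temperature-softened distributions, and the EMA convention (the teacher moves toward the student by a factor α per step) is an assumption.

```python
import math
from typing import List

# Values copied from the config list above.
TRAIN_SAMPLES = 9_864
EPOCHS = 2
BATCH_SIZE = 1
GRAD_ACCUM = 32
PEAK_LR = 5e-6
WARMUP_STEPS = 10
EMA_ALPHA = 0.01
KL_TEMPERATURE = 2.0


def optimizer_steps() -> int:
    """Full accumulation windows per epoch (partial one dropped) times epochs."""
    per_epoch = TRAIN_SAMPLES // (BATCH_SIZE * GRAD_ACCUM)
    return per_epoch * EPOCHS


def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup, then cosine decay to zero (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


def log_softmax(logits: List[float], temperature: float) -> List[float]:
    """Log-probabilities of temperature-scaled logits, computed stably."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_norm for z in scaled]


def reverse_kl(student_logits: List[float], teacher_logits: List[float],
               temperature: float = KL_TEMPERATURE) -> float:
    """KL(student || teacher) for a single token position."""
    log_p = log_softmax(student_logits, temperature)  # student
    log_q = log_softmax(teacher_logits, temperature)  # teacher
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def ema_update(teacher: List[float], student: List[float],
               alpha: float = EMA_ALPHA) -> List[float]:
    """Assumed convention: teacher <- (1 - alpha) * teacher + alpha * student."""
    return [(1.0 - alpha) * t + alpha * s for t, s in zip(teacher, student)]


total = optimizer_steps()
print("optimizer steps:", total)
print("lr at steps 5, 10, 313:", lr_at(5, total), lr_at(10, total), lr_at(313, total))
print("reverse KL (toy logits):", reverse_kl([2.0, 0.0, 0.0], [0.0, 0.0, 2.0]))
print("EMA teacher (toy weights):", ema_update([1.0, 1.0], [0.0, 2.0]))
```

With these values, `optimizer_steps()` returns 616, which matches the step counts in the checkpoint table. The reverse direction of the KL weights the log-ratio by the student's own probabilities, which fits the on-policy sampling listed in the config.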
## Loading
```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# Load the step-600 checkpoint; change `subfolder` to pick another one.
model = AutoModelForImageTextToText.from_pretrained(
    "Chengheng/webos-sdft-2b",
    subfolder="checkpoint-600",
    torch_dtype="bfloat16",
    device_map="cuda:0",
)
processor = AutoProcessor.from_pretrained(
    "Chengheng/webos-sdft-2b",
    subfolder="checkpoint-600",
)
```
## Code
Training and eval scripts: https://github.com/ChenghengLi/WebOS