webos-sdft-2b / README.md
Chengheng's picture
Upload README.md with huggingface_hub
3d67ea4 verified
---
license: apache-2.0
base_model: Qwen/Qwen3.5-2B
tags:
- gui-grounding
- sdft
- self-distillation
- qwen3.5
---
# WebOS SDFT 2B
Full-parameter SDFT (Self-Distillation Fine-Tuning, paper-faithful Shenfeld et al. 2601.19897 Algorithm 1) of Qwen3.5-2B on the WebOS GUI-grounding dataset.
## Layout
| Subfolder | Description | Click-acc on full test set |
|---|---|---|
| `checkpoint-200/` | Step 200 / 616 (32% of training) | 18.41% |
| `checkpoint-400/` | Step 400 / 616 (65%) | 18.75% |
| `checkpoint-600/` | Step 600 / 616 (97%) | **19.95%** |
| `final/` | Step 616 / 616 (100%) | ~20% |
For comparison: base Qwen3.5-2B on the same eval is 18.84%, and Qwen3.5-2B + privileged GT-demo teacher prompt is 26.46%.
## Training config (16b_train_sdft_full_ft.py)
- bf16, full-param FT (vision frozen)
- bs=1, ga=32 (effective batch 32)
- lr=5e-6, warmup=10 steps, cosine
- AdamW, wd=0, max_grad_norm=1.0
- ema-α=0.01, kl-temperature=2.0, reverse KL
- 2 epochs × 9,864 train samples = 616 optimizer steps
- on-policy max_new_tokens=96, temperature=1.0
## Loading
```python
from transformers import AutoModelForImageTextToText, AutoProcessor
model = AutoModelForImageTextToText.from_pretrained(
"Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
torch_dtype="bfloat16", device_map="cuda:0",
)
processor = AutoProcessor.from_pretrained(
"Chengheng/webos-sdft-2b", subfolder="checkpoint-600",
)
```
## Code
Training and eval scripts: https://github.com/ChenghengLi/WebOS