	---
	license: apache-2.0
	base_model: Qwen/Qwen3.5-2B
	tags:
	- gui-grounding
	- sdft
	- self-distillation
	- qwen3.5
	---

	# WebOS SDFT 2B

	Full-parameter SDFT (Self-Distillation Fine-Tuning) of Qwen3.5-2B on the WebOS GUI-grounding dataset, following Algorithm 1 of Shenfeld et al. (arXiv:2601.19897).

	## Layout

	| Subfolder | Description | Click accuracy (full test set) |
	|---|---|---|
	| `checkpoint-200/` | Step 200 / 616 (32% of training) | 18.41% |
	| `checkpoint-400/` | Step 400 / 616 (65% of training) | 18.75% |
	| `checkpoint-600/` | Step 600 / 616 (97% of training) | 19.95% |
	| `final/` | Step 616 / 616 (100% of training) | ~20% |

	For comparison, base Qwen3.5-2B scores 18.84% on the same evaluation, and Qwen3.5-2B with the privileged ground-truth-demonstration (GT-demo) teacher prompt scores 26.46%.
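This card does not spell out how click accuracy is computed. A common convention in GUI grounding counts a prediction as correct when the predicted click point falls inside the target element's bounding box. The sketch below assumes that convention; the function name `click_accuracy` and the `(x1, y1, x2, y2)` box format are illustrative and may not match the evaluation scripts in the repository.

```python
def click_accuracy(predictions, boxes):
    """Fraction of predicted (x, y) points that land inside their target box.

    Assumption: each box is (x1, y1, x2, y2) in the same coordinate system
    as the predicted points, with inclusive edges.
    """
    if len(predictions) != len(boxes):
        raise ValueError("predictions and boxes must have the same length")
    if not predictions:
        return 0.0
    hits = 0
    for (x, y), (x1, y1, x2, y2) in zip(predictions, boxes):
        if x1 <= x <= x2 and y1 <= y <= y2:
            hits += 1
    return hits / len(predictions)


if __name__ == "__main__":
    preds = [(5, 5), (50, 50)]
    targets = [(0, 0, 10, 10), (0, 0, 10, 10)]
    print(f"click accuracy: {click_accuracy(preds, targets):.2%}")
```
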

	## Training config (`16b_train_sdft_full_ft.py`)
	- bf16, full-param FT (vision frozen)
	- bs=1, ga=32 (effective batch 32)
	- lr=5e-6, warmup=10 steps, cosine
	- AdamW, wd=0, max_grad_norm=1.0
	- ema-α=0.01, kl-temperature=2.0, reverse KL
	- 2 epochs × 9,864 train samples ÷ 32 (effective batch) ≈ 616 optimizer steps
	- on-policy max_new_tokens=96, temperature=1.0
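To make two of the settings above concrete, here is a minimal pure-Python sketch of a temperature-scaled reverse KL term and an EMA teacher update. It is an illustration under stated assumptions, not the code in `16b_train_sdft_full_ft.py`. It assumes "reverse KL" means KL(student ‖ teacher) over per-token distributions, that both sets of logits are divided by the temperature, and that the EMA rule is `teacher = (1 - α) * teacher + α * student`. All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_logits, teacher_logits, temperature=2.0):
    """KL(student || teacher) for a single token position (assumed direction)."""
    p_student = softmax(student_logits, temperature)
    p_teacher = softmax(teacher_logits, temperature)
    return sum(
        s * (math.log(s) - math.log(t))
        for s, t in zip(p_student, p_teacher)
        if s > 0.0
    )


def ema_update(teacher, student, alpha=0.01):
    """Move each teacher parameter a fraction alpha toward the student (assumed rule)."""
    return {
        name: (1.0 - alpha) * teacher[name] + alpha * student[name]
        for name in teacher
    }


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]
    teacher = [1.0, 1.0, -0.5]
    print("reverse KL:", reverse_kl(student, teacher, temperature=2.0))
    print("EMA teacher:", ema_update({"w": 1.0}, {"w": 0.0}, alpha=0.01))
```

With α=0.01 the teacher moves only 1% of the way toward the student at each update, so it changes slowly over the 616 optimizer steps.
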

	## Loading

	```python
	from transformers import AutoModelForImageTextToText, AutoProcessor

	# Each checkpoint lives in its own subfolder (see the Layout table).
	model = AutoModelForImageTextToText.from_pretrained(
	    "Chengheng/webos-sdft-2b",
	    subfolder="checkpoint-600",
	    torch_dtype="bfloat16",
	    device_map="cuda:0",
	)
	# Load the processor from the same subfolder so it matches the checkpoint.
	processor = AutoProcessor.from_pretrained(
	    "Chengheng/webos-sdft-2b",
	    subfolder="checkpoint-600",
	)
	```

	## Code

	Training and eval scripts: https://github.com/ChenghengLi/WebOS