
	---
	license: apache-2.0
	library_name: diffusers
	tags:
	- robotics
	- world-model
	- diffusion
	- step-distillation
	- lingbot-va
	pipeline_tag: robotics
	---

	# Flash-WAM — RoboTwin (distilled)

	Single-step distilled checkpoint for *Flash-WAM: Modality-Aware Distillation for World Action Models*, applied to LingBot-VA and evaluated on RoboTwin 2.0. Flash-WAM distills each modality with a consistency function matched to its noise regime: linear-gradient-scaling for the action stream and variance-preserving for the video stream. This compresses inference to a single step per modality, for up to a 23× speedup while preserving teacher-level task success.

	This repository contains the complete model (distilled transformer + encoders):

	| Component | Description |
	| :--- | :--- |
	| `transformer/` | Distilled Flash-WAM student |
	| `vae/` | VAE (from the LingBot-VA teacher) |
	| `text_encoder/` | UMT5-XXL text encoder (from the teacher) |
	| `tokenizer/` | T5 tokenizer |
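
Before starting a server, it can help to confirm that a local copy of the checkpoint contains all four subfolders from the table above. The following is a minimal sketch using only the Python standard library. The helper name `missing_components` is hypothetical and is not part of the Flash-WAM or LingBot-VA code; it checks only that the directories exist, not that their contents are valid.

```python
from pathlib import Path

# Subfolders listed in the component table of this model card.
EXPECTED_COMPONENTS = ("transformer", "vae", "text_encoder", "tokenizer")


def missing_components(checkpoint_dir):
    """Return the expected subfolder names that are absent from checkpoint_dir.

    Hypothetical helper for illustration: it only tests for the presence of
    each directory and does not inspect the files inside.
    """
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_COMPONENTS if not (root / name).is_dir()]
```

Calling `missing_components("/path/to/FlashWAM-RoboTwin")` returns an empty list when every component directory is present, and otherwise lists the missing ones in table order.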

	## Links

	- 📄 Paper: https://arxiv.org/abs/2606.05254
	- 🌐 Project page: https://flashwam.github.io
	- 💻 Code: https://github.com/NU-World-Model-Embodied-AI/Flash-WAM

	## Usage

	For environment setup and evaluation, follow the [Flash-WAM repository](https://github.com/NU-World-Model-Embodied-AI/Flash-WAM) and [LingBot-VA](https://github.com/Robbyant/lingbot-va). Point the inference server at this checkpoint directory.
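
One way to obtain a local checkpoint directory is to download the repository with `huggingface_hub.snapshot_download`. The sketch below makes several assumptions: the repository id is inferred from the organization and model name of this page, the target path is arbitrary, and the `fetch_checkpoint` helper with its `downloader` parameter is hypothetical (the parameter exists only so the function can be exercised offline). How the inference server accepts the checkpoint path is defined by the Flash-WAM and LingBot-VA repositories, not by this snippet.

```python
from pathlib import Path

# Assumed repository id, inferred from the organization and model name of this page.
REPO_ID = "NU-World-Model-Embodied-AI/FlashWAM-RoboTwin"


def fetch_checkpoint(local_dir, repo_id=REPO_ID, downloader=None):
    """Download the checkpoint into local_dir and return its absolute path.

    `downloader` is a hypothetical hook for offline testing; when omitted,
    huggingface_hub.snapshot_download performs the actual download.
    """
    if downloader is None:
        # Imported lazily so the helper can be defined without the package installed.
        from huggingface_hub import snapshot_download
        downloader = snapshot_download
    target = Path(local_dir).expanduser().resolve()
    # snapshot_download places the repository files directly under local_dir.
    downloader(repo_id=repo_id, local_dir=str(target))
    return target
```

For example, `fetch_checkpoint("./checkpoints/FlashWAM-RoboTwin")` returns the absolute path of the downloaded directory, which can then be passed to the inference server as the checkpoint location.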

	## Citation

	```bibtex
	@misc{akbari2026flashwammodalityawaredistillationworld,
	title={Flash-WAM: Modality-Aware Distillation for World Action Models},
	author={Arman Akbari and Ci Zhang and Arash Akbari and Lin Zhao and Yixiao Chen and Weiwei Chen and Xuan Zhang and Geng Yuan and Yanzhi Wang},
	year={2026},
	eprint={2606.05254},
	archivePrefix={arXiv},
	primaryClass={cs.LG},
	url={https://arxiv.org/abs/2606.05254},
	}
	```

	License: Apache-2.0.