---
license: mit
library_name: pytorch
pipeline_tag: robotics
tags:
- robotics
- vision-language-action
- vla
- libero
- manipulation
- qwen-vl
---

# SemanticVLA · LIBERO

> 🎉 **Accepted to [CVPR 2026](https://cvpr.thecvf.com/virtual/2026/poster/39352).**
> ✍️ Fei Ni¹, Zhuo Chen², Yifu Yuan³, Zibin Dong³, Xianze Yao³, Shan Luo², Jianye Hao³, Jiankang Deng¹†, Stefanos Zafeiriou¹†
> 🏫 ¹Imperial College London    ²King's College London    ³Tianjin University
> ✉️ Primary contact: [f.ni@imperial.ac.uk](mailto:f.ni@imperial.ac.uk)

[SemanticVLA](https://github.com/Fei-Ni/SemanticVLA_Offcial) finetuned on the [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO) benchmark. The unified OXE LAM is used as the latent-action tokenizer, and the trace + latent-action auxiliary heads are supervised in the VLM's language stream.

## Headline result

| Suite | Success rate |
|---|---:|
| `libero_spatial` | 0.988 |
| `libero_object` | 0.996 |
| `libero_goal` | 0.974 |
| `libero_10` | 0.970 |
| **4-suite mean** | **0.982** |

## Architecture

| Component | Choice |
|---|---|
| VLM backbone | Qwen3-VL-4B-Instruct |
| Action head | DiT-B (flow matching) |
| LAM tokenizer | [`SemanticVLA-LAM`](https://huggingface.co/spikefly/SemanticVLA-LAM) (unified OXE LAM) |
| Semantic supervision | Trace + latent action tokens predicted in the VLM's language stream; action decoder unmodified |
| Latent vocabulary size | 32 |
| Latent tokens per sample | 4 |
| Action horizon | 8 |

## Files

```
SemanticVLA-LIBERO/
├── README.md
├── config.yaml               # loadable model config
├── dataset_statistics.json   # action normalization stats
└── final_model/
    └── pytorch_model.pt      # policy state_dict
```

## How to load

```python
from semanticvla.model.framework.base_framework import baseframework

policy = baseframework.from_pretrained("pytorch_model.pt")
policy.eval()
```

`baseframework.from_pretrained()` walks two directory levels up from the checkpoint file to locate `config.yaml` and `dataset_statistics.json`. The released layout follows this convention.

To run a full LIBERO evaluation, see [`examples/LIBERO/`](https://github.com/Fei-Ni/SemanticVLA_Offcial/tree/main/examples/LIBERO) in the code repo.
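The lookup convention described under "How to load" can be sketched with `pathlib`, which may help when checking a local download before loading. This is only an illustration: the helper names `locate_sidecar_files` and `missing_files` are hypothetical and not part of the SemanticVLA code base, and the sketch assumes "two directory levels up" means the directory that contains `final_model/`, as in the released layout. It does not reproduce the actual logic inside `baseframework.from_pretrained()`.

```python
from pathlib import Path


def locate_sidecar_files(checkpoint_path):
    """Hypothetical helper: map a checkpoint path to the files expected next to it.

    Assumes the released layout, i.e. <root>/final_model/pytorch_model.pt with
    config.yaml and dataset_statistics.json stored directly in <root>.
    """
    ckpt = Path(checkpoint_path).resolve()
    # parents[0] is final_model/, parents[1] is the repository root.
    root = ckpt.parents[1]
    return {
        "checkpoint": ckpt,
        "config": root / "config.yaml",
        "dataset_statistics": root / "dataset_statistics.json",
    }


def missing_files(paths):
    """Return the keys of all expected files that do not exist on disk."""
    return [name for name, path in paths.items() if not path.exists()]


if __name__ == "__main__":
    # Example path only; point this at your own local copy of the checkpoint.
    paths = locate_sidecar_files("SemanticVLA-LIBERO/final_model/pytorch_model.pt")
    for name, path in paths.items():
        print(f"{name}: {path}")
    print("missing:", missing_files(paths))
```

Running the check before loading gives a direct list of absent files, which is easier to act on than a failure raised from inside the loader.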

## Sibling SemanticVLA checkpoint repos

| Repo | Purpose |
|---|---|
| 🤗 [`SemanticVLA-LAM`](https://huggingface.co/spikefly/SemanticVLA-LAM) | Unified OXE LAM consumed by this policy |
| 🤗 [`SemanticVLA-SimplerEnv`](https://huggingface.co/spikefly/SemanticVLA-SimplerEnv) | SimplerEnv WidowX policy |

## Related resources

- **Code**: https://github.com/Fei-Ni/SemanticVLA_Offcial
- **Datasets collection**: https://hf.co/collections/spikefly/semanticvla-datasets
- **Model Zoo collection**: https://hf.co/collections/spikefly/semanticvla-model-zoo

## Citation

```bibtex
@inproceedings{ni2026semanticvla,
  title     = {SemanticVLA: Towards Semantic Reasoning over Action Memorization via Synergistic Explicit Trace and Latent Action Planning},
  author    = {Ni, Fei and Chen, Zhuo and Yuan, Yifu and Dong, Zibin and Yao, Xianze and Luo, Shan and Hao, Jianye and Deng, Jiankang and Zafeiriou, Stefanos},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}
```

## License

Released under the [MIT License](https://github.com/Fei-Ni/SemanticVLA_Offcial/blob/main/LICENSE).