---
license: mit
language:
- en
pipeline_tag: robotics
---

<div align="center">

<p align="center">
<img src="villa-x-transparent.png" width="400"/>
</p>

<h1>villa-X: A Vision-Language-Latent-Action Model</h1>

[Paper](https://arxiv.org/abs/2507.23682)   [Project page](https://microsoft.github.io/villa-x)   [Code](https://github.com/microsoft/villa-x/)
</div>

## How to use
See the GitHub repository for usage instructions: [https://github.com/microsoft/villa-x/](https://github.com/microsoft/villa-x/)