---
license: apache-2.0
pipeline_tag: image-to-video
tags:
- image-to-video
- robotics
- world-model
arxiv: 2603.17808
---

# EVA

This repository hosts the EVA checkpoint released with:

**EVA: Aligning Video World Models with Executable Robot Actions via Inverse Dynamics Rewards**

Project page: https://eva-project-page.github.io/

arXiv: https://arxiv.org/abs/2603.17808

## Model Summary

This checkpoint is an EVA model for robotic video planning. It is built on top of a Wan2.1 I2V 14B backbone and further adapted on **Robotwin** through:

- supervised fine-tuning (SFT)
- reinforcement-learning-based post-training (RL)

The released checkpoint corresponds to the merged model after both stages of post-training.

## Intended Use

This model is intended for research use in robot video prediction and visual planning. Given an input image and a language instruction, the model generates future video rollouts that are better aligned with executable robot behavior.
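
To experiment with the checkpoint, the files hosted in this repository first need to be available locally. The following is a minimal sketch of one way to fetch them, assuming the `huggingface_hub` package is installed. The helper name `download_checkpoint` and the default `local_dir` are hypothetical, and the repository id is left for the reader to supply. Loading the weights into an inference pipeline is not covered here.

```python
from pathlib import Path


def download_checkpoint(repo_id: str, local_dir: str = "checkpoints/eva") -> Path:
    """Download all files of a Hub repository and return the local folder.

    `repo_id` is the '<namespace>/<name>' id of this repository on the Hub.
    `local_dir` is an arbitrary target folder (hypothetical default).
    """
    if not repo_id or "/" not in repo_id:
        raise ValueError("repo_id must look like '<namespace>/<name>'")

    # Imported lazily so this module can be imported even when
    # huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the whole repository and returns its local path.
    path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint("<namespace>/<name>")` with the id of this repository should return the folder that contains the downloaded files. A 14B backbone implies a large download, so it is worth checking free disk space first.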