---
license: apache-2.0
pipeline_tag: image-to-video
tags:
- image-to-video
- robotics
- world-model
arxiv: 2603.17808
---

# EVA

This repository hosts the EVA checkpoint released with:

**EVA: Aligning Video World Models with Executable Robot Actions via Inverse Dynamics Rewards**

Project page: https://eva-project-page.github.io/  
arXiv: https://arxiv.org/abs/2603.17808

## Model Summary

This checkpoint is an EVA model for robotic video planning.

It is built on top of a Wan2.1 I2V 14B backbone and further adapted on **Robotwin** through:

- supervised fine-tuning (SFT)
- reinforcement-learning-based post-training (RL)

The released checkpoint corresponds to the merged model after both stages of post-training.

## Intended Use

This model is intended for research use in robot video prediction and visual planning.

Given an input image and a language instruction, the model generates future video rollouts. Post-training is meant to align these rollouts more closely with executable robot behavior.
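
To make the input contract above concrete, here is a minimal sketch of how a planning query (one image plus one instruction) might be packaged before it is handed to a generation call. `RolloutRequest`, `validate`, the file name, and the default values are hypothetical names and settings chosen for illustration. They are not part of the EVA release, and the model card does not specify a loading or inference API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutRequest:
    """Hypothetical container for one planning query (not an EVA API)."""

    image_path: str        # path to the initial observation frame
    instruction: str       # natural-language task description
    num_frames: int = 81   # illustrative default, not a documented setting
    num_rollouts: int = 1  # how many candidate futures to sample


def validate(request: RolloutRequest) -> RolloutRequest:
    """Reject malformed queries before any expensive generation call."""
    if not request.instruction.strip():
        raise ValueError("instruction must be non-empty")
    if request.num_frames < 1 or request.num_rollouts < 1:
        raise ValueError("num_frames and num_rollouts must be positive")
    return request


# Example query: a placeholder image path and a placeholder instruction.
request = validate(RolloutRequest("first_frame.png", "pick up the red block"))
```

Checking the query up front keeps trivial input mistakes from surfacing only after a 14B-parameter model has been loaded.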