---
sdk: static
pinned: false
---
# VMEvalKit 🎥🧠

<div align="center">

[![results](https://img.shields.io/badge/Result-A42C25?style=for-the-badge&logo=googledisplayandvideo360&logoColor=white)](https://grow-ai-like-a-child.com/video-reason/)
[![Paper](https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://github.com/hokindeng/VMEvalKit/paper/video-models-start-to-solve/Video_Model_Start_to_Solve.pdf)
[![Hugging Face](https://img.shields.io/badge/hf-fcd022?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/VideoReason)
[![WeChat](https://img.shields.io/badge/WeChat-07C160?style=for-the-badge&logo=wechat&logoColor=white)](https://github.com/hokindeng/VMEvalKit/issues/132)

</div>

A framework for evaluating reasoning capabilities in video generation models at scale.

![VMEvalKit Framework](https://github.com/hokindeng/VMEvalKit/paper/video-models-start-to-solve/assets/draft_1.jpg)

## 🎬 Supported Models

VMEvalKit provides unified access to **40 video generation models** across **11 provider families**.

For commercial APIs, we support Luma Dream Machine, Google Veo, Google Veo 3.1, WaveSpeed WAN 2.1, WaveSpeed WAN 2.2, Runway ML, and OpenAI Sora. For open-source models, we support HunyuanVideo, VideoCrafter, DynamiCrafter, Stable Video Diffusion, Morphic, LTX-Video, and others. See [here](docs/models/README.md) for details.

## Invitation to Collaborate 🤝

VMEvalKit is meant to be a permissively licensed, open-source **shared playground** for everyone. If you're interested in machine cognition, video models, evaluation, or anything at all 🦄✨, we'd love to build with you:

* 🧪 Add new reasoning tasks (planning, causality, social, physical, etc.)
* 🎥 Plug in new video models (APIs or open-source)
* 📊 Experiment with better evaluation metrics and protocols
* 🧱 Improve infrastructure, logging, and the web dashboard
* 📚 Use VMEvalKit in your own research and share back configs/scripts
* 🌟🎉 Or anything at all 🦄✨

💬 **Join us on Slack** to ask questions, propose ideas, or start a collaboration:
[Slack Invite](https://join.slack.com/t/growingailikeachild/shared_invite/zt-309yqd0sl-W8xzOkdBPha1Jh5rnee78A) 🚀

## Research

Here we track papers spun off from this code infrastructure, along with some works in progress.

- [**"Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's Matrices"**](paper/video-models-start-to-solve/Video_Model_Start_to_Solve.pdf)

  This paper implements our experimental framework and demonstrates that leading video generation models (e.g., Sora-2) can perform visual reasoning tasks with success rates above 60%. See [**results**](https://grow-ai-like-a-child.com/video-reason/).
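
As a toy illustration of the kind of per-task success-rate aggregation behind such numbers (the task names and pass/fail data below are hypothetical, not results from the paper):

```python
from collections import defaultdict

def success_rates(records):
    """Aggregate binary pass/fail records into per-task success rates.

    `records` is a list of (task, passed) pairs. The data used here is
    hypothetical, for illustration only.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for task, passed in records:
        totals[task] += 1
        passes[task] += int(passed)
    return {task: passes[task] / totals[task] for task in totals}

# Hypothetical example: 3 maze attempts (2 passed), 2 sudoku attempts (1 passed)
records = [("maze", True), ("maze", True), ("maze", False),
           ("sudoku", True), ("sudoku", False)]
rates = success_rates(records)
```

A real evaluation run would of course draw `records` from model outputs scored by the framework's judges, but the aggregation step is this simple.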

## License

Apache 2.0

## Citation

If you find VMEvalKit useful in your research, please cite:

```bibtex
@misc{VMEvalKit,
  author       = {VMEvalKit Team},
  title        = {VMEvalKit: A framework for evaluating reasoning abilities in foundational video models},
  year         = {2025},
  howpublished = {\url{https://github.com/Video-Reason/VMEvalKit}}
}
```