license: apache-2.0
pipeline_tag: video-text-to-text
library_name: transformers

TinyLLaVA-Video-R1


We introduce TinyLLaVA-Video-R1, a small-scale video reasoning model built on the traceably trained model TinyLLaVA-Video. After reinforcement learning on general Video-QA datasets, the model shows significantly improved reasoning and thinking abilities and also exhibits emergent “aha moments”.

Result

| Model (HF Path) | Video-MME | MVBench | MLVU | MMVU |
|---|---|---|---|---|
| Zhang199/TinyLLaVA-Video-R1 | 46.6 | 49.5 | 52.4 | 46.9 |
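
For a quick single-number comparison against other models, the four scores in the table above can be averaged. The sketch below is only an illustration: the unweighted mean is an assumption made here for convenience, not a metric reported with the model, and the variable names are hypothetical.

```python
from statistics import mean

# Scores copied from the results table above.
scores = {
    "Video-MME": 46.6,
    "MVBench": 49.5,
    "MLVU": 52.4,
    "MMVU": 46.9,
}

# Unweighted mean across the four benchmarks (illustrative summary only;
# each benchmark is treated as equally important, which is an assumption).
average = round(mean(scores.values()), 2)
print(f"Average over {len(scores)} benchmarks: {average}")
```

Because the benchmarks differ in size and task mix, the per-benchmark numbers remain the more informative comparison.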