Update README.md
README.md
@@ -7,4 +7,9 @@ pipeline_tag: image-text-to-text
[](https://github.com/ZhangXJ199/TinyLLaVA-Video-R1)[](https://github.com/ZhangXJ199/TinyLLaVA-Video-R1)
We introduce TinyLLaVA-Video-R1, a small-scale video reasoning model built on [TinyLLaVA-Video](https://github.com/ZhangXJ199/TinyLLaVA-Video), whose training process is fully traceable. After reinforcement learning on general Video-QA datasets, the model shows significantly improved reasoning and thinking abilities and exhibits emergent “aha moments”.

### Result

| Model (HF Path) | Video-MME | MVBench | MLVU | MMVU |
| :----------------------------------------: | ------------- | ------- | -------------- | ---------- |
| [Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-1fps-512](https://huggingface.co/Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-1fps-512) | 46.6 | 49.5 | 52.4 | 46.9 |