WRHC/EfficientVideoAgent

Video-Text-to-Text

image-text-to-text

text-generation-inference


WRHC committed 3 days ago

Commit d311e03 · 1 Parent(s): 554d88f

update readme

Files changed (1)

README.md +5 -4

README.md CHANGED

@@ -5,11 +5,12 @@ library_name: transformers
 # EVA: Efficient Reinforcement Learning for End-to-End Video Agent
-[![Paper](https://img.shields.io/badge/Paper-Link-b31b1b.svg)](https://arxiv.org/abs/2603.22918)
-[![GitHub](https://img.shields.io/badge/GitHub-Repository-black.svg)](https://github.com/wangruohui/EfficientVideoAgent)
-[![Model](https://img.shields.io/badge/Model-Link-blue.svg)](https://huggingface.co/WRHC/EfficientVideoAgent/)
-This repository contains the official evaluation code for the model proposed in the paper [EVA: Efficient Reinforcement Learning for End-to-End Video Agent](https://arxiv.org/abs/2603.22918).
 EVA (Efficient Video Agent) is an end-to-end framework that enables "planning-before-perception" through iterative summary-plan-action-reflection reasoning. Unlike passive recognizers, EVA autonomously decides what to watch, when to watch, and how to watch, achieving query-driven and efficient video understanding.

 # EVA: Efficient Reinforcement Learning for End-to-End Video Agent
+[![Paper](https://img.shields.io/badge/Paper-2603.22918-b31b1b.svg)](https://arxiv.org/abs/2603.22918)
+[![Paper](https://img.shields.io/badge/Paper-2603.22918-yellow.svg)](https://huggingface.co/papers/2603.22918)
+[![GitHub](https://img.shields.io/badge/GitHub-EfficientVideoAgent-black.svg)](https://github.com/wangruohui/EfficientVideoAgent)
+[![Model](https://img.shields.io/badge/Model-EfficientVideoAgent-blue.svg)](https://huggingface.co/WRHC/EfficientVideoAgent/)
+This repository contains the model weights proposed in our paper [EVA: Efficient Reinforcement Learning for End-to-End Video Agent](https://arxiv.org/abs/2603.22918). Official evaluation codes are hosted on [GitHub](https://github.com/wangruohui/EfficientVideoAgent).
 EVA (Efficient Video Agent) is an end-to-end framework that enables "planning-before-perception" through iterative summary-plan-action-reflection reasoning. Unlike passive recognizers, EVA autonomously decides what to watch, when to watch, and how to watch, achieving query-driven and efficient video understanding.