WHB139426/Grounded-Video-LLM


WHB139426 committed on Oct 7, 2024

Commit 9492a2e · verified · 1 Parent(s): be990be

Create README.md

Files changed (1)

README.md +30 -0

README.md ADDED

	@@ -0,0 +1,30 @@

+---
+license: mit
+language:
+- en
+---
+# Grounded-VideoLLM Model Card
+Grounded-VideoLLM is a Video-LLM designed for fine-grained temporal grounding. It excels at grounding tasks such as temporal sentence grounding, dense video captioning, and grounded VideoQA, and it also shows great potential as a versatile video assistant for general video understanding.
+## Model details
+**Model date:**
+Grounded-VideoLLM-Phi3.5-Vision-Instruct was trained in Oct. 2024.
+**Paper or resources for more information:**
+[Paper](https://arxiv.org/abs/2410.03290), [Code](https://github.com/WHB139426/Grounded-Video-LLM)
+## Citation
+If you find our project useful, please consider starring our repo and citing our paper as follows:
+```
+@misc{wang2024groundedvideollm,
+    title={Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models},
+    author={Haibo Wang and Zhiyang Xu and Yu Cheng and Shizhe Diao and Yufan Zhou and Yixin Cao and Qifan Wang and Weifeng Ge and Lifu Huang},
+    year={2024},
+    eprint={2410.03290},
+    archivePrefix={arXiv},
+    primaryClass={cs.CV}
+}
+```