Update README.md
README.md
CHANGED
@@ -6,4 +6,26 @@ tags:
 - audiovisual
 - video
 - captioner
----
+---
+
+# AVoCaDO: An <u>A</u>udio<u>V</u>isual Vide<u>o</u> <u>Ca</u>ptioner <u>D</u>riven by Temporal <u>O</u>rchestration
+
+<p align="left">
+<a href="https://avocado-captioner.github.io/"><img src="https://img.shields.io/badge/Project%20webpage-558b2f?style=for-the-badge"></a>
+<a href="https://github.com/AVoCaDO-Captioner/AVoCaDO"><img src="https://img.shields.io/badge/Github-db8905?style=for-the-badge"></a>
+<a href="https://arxiv.org/abs/todo"><img src="https://img.shields.io/badge/arXiv-red?style=for-the-badge"></a>
+</p>
+
+## ✨ Overview
+Audiovisual video captioning aims to generate semantically rich descriptions with temporal alignment between visual and auditory events, thereby benefiting both video understanding and generation. We introduce <b>AVoCaDO</b>, a powerful audiovisual video captioner driven by the temporal orchestration between audio and visual modalities. Experimental results demonstrate that AVoCaDO significantly outperforms existing open-source models across four audiovisual video captioning benchmarks, and also achieves competitive performance under visual-only settings.
+
+## 🚀 Getting Started
+Please refer to our [Github repository](https://github.com/AVoCaDO-Captioner/AVoCaDO) for more details.
+
+## ✒️ Citation
+
+If you find our work helpful for your research, please consider giving a star ⭐ and citing our paper. We appreciate your support!
+
+```bibtex
+todo
+```