Based on the Qwen2.5 language model, Ola is trained on text, image, video, and audio data.
Ola offers an on-demand solution to seamlessly and efficiently process visual inputs with arbitrary spatial sizes and temporal lengths.

- **Repository:** https://github.com/Ola-Omni/Ola
- **Languages:** English, Chinese
- **Paper:** https://arxiv.org/abs/2502.04328

## Use

- **Code:** PyTorch

## Citation

```bibtex
@article{liu2025ola,
  title={Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment},
  author={Liu, Zuyan and Dong, Yuhao and Wang, Jiahui and Liu, Ziwei and Hu, Winston and Lu, Jiwen and Rao, Yongming},
  journal={arXiv preprint arXiv:2502.04328},
  year={2025}
}
```