Tags: Video-Text-to-Text, Transformers, Safetensors, English, qwen3_vl, image-text-to-text, video-grounding, temporal-grounding, video-understanding, qwen3-vl
Instructions to use TencentARC/TimeLens-8B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Transformers

How to use TencentARC/TimeLens-8B with Transformers:

    # Load model directly
    from transformers import AutoProcessor, AutoModelForImageTextToText

    processor = AutoProcessor.from_pretrained("TencentARC/TimeLens-8B")
    model = AutoModelForImageTextToText.from_pretrained("TencentARC/TimeLens-8B")

- Notebooks
  - Google Colab
  - Kaggle
Update README.md
README.md
CHANGED
@@ -215,5 +215,10 @@ print(f"Answer: {answer}")
 If you find our work helpful for your research and applications, please cite our paper:
 
 ```bibtex
-
+@article{zhang2025timelens,
+title={TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs},
+author={Zhang, Jun and Wang, Teng and Ge, Yuying and Ge, Yixiao and Li, Xinhao and Shan, Ying and Wang, Limin},
+journal={arXiv preprint arXiv:2512.14698},
+year={2025}
+}
 ```