Tags: video-text-to-text · transformers · safetensors · English · qwen2_5_vl · image-text-to-text · video-understanding · reasoning · multimodal · reinforcement-learning · question-answering · text-generation-inference
Instructions to use Falconss1/VideoThinker-R1-3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Falconss1/VideoThinker-R1-3B with Transformers:
```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Falconss1/VideoThinker-R1-3B")
model = AutoModelForImageTextToText.from_pretrained("Falconss1/VideoThinker-R1-3B")
```

- Notebooks
- Google Colab
- Kaggle
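Once the processor and model are loaded, inference follows the usual Qwen2.5-VL chat-template flow. The snippet below is a minimal sketch of the message structure such processors expect; the video path and question are placeholder values I've assumed for illustration, and the actual prompt-building and generation calls are shown only in comments since they require the model weights to be downloaded.

```python
# Minimal sketch of the chat-message structure a Qwen2.5-VL-style processor
# expects. The video path and question are placeholders, not files that ship
# with the model.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "path/to/clip.mp4"},  # placeholder path
            {"type": "text", "text": "Describe what happens in this video."},
        ],
    }
]

# With processor and model loaded as above, the prompt string is typically
# built with the processor's chat template:
#   text = processor.apply_chat_template(
#       messages, tokenize=False, add_generation_prompt=True
#   )
# and generation then runs through model.generate on the processed inputs.
```

The nested `content` list is what lets a single user turn interleave video and text parts, which is how video-question prompts are passed to this family of models.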
Update README.md
README.md CHANGED

````diff
@@ -32,10 +32,10 @@ For training and evaluation, please refer to the Code: https://github.com/falons
 
 If you find this project useful in your research, please consider cite:
 ```BibTeX
-@
-title={
-author={
-
-year={
+@inproceedings{wu2026videothinker,
+title={Beyond Perceptual Shortcuts: Causal-Inspired Debiasing Optimization for Generalizable Video Reasoning in Lightweight MLLMs},
+author={Wu, Jingze and Zhang, Quan and Suo, Hongfei and Cai, Zeqiang and Chen, Hongbo},
+booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+year={2026}
 }
 ```
````