Video-Text-to-Text
Transformers
Safetensors
English
qwen3_vl
image-text-to-text
video-grounding
temporal-grounding
video-understanding
qwen3-vl
Fix processor resize defaults for vLLM compatibility
Remove explicit `do_resize: false` from the video and image processor configs so the checkpoint falls back to the upstream Qwen3-VL default behavior. This keeps Transformers examples unchanged when `do_resize=False` is passed explicitly, while restoring compatibility for runtimes like vLLM that rely on the saved processor defaults.
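The fallback described above can be illustrated with a small, self-contained sketch. It does not call Transformers or vLLM; `drop_key`, `effective_setting`, and `UPSTREAM_DEFAULTS` are hypothetical names introduced only for illustration, and the upstream default of `do_resize = True` is an assumption about the Qwen3-VL processors rather than something stated in this commit.

```python
import json

# Assumption: with the key absent, the upstream processor resizes by default.
UPSTREAM_DEFAULTS = {"do_resize": True}


def drop_key(config_text: str, key: str = "do_resize") -> str:
    """Return the JSON config text with `key` removed (a no-op if it is absent)."""
    config = json.loads(config_text)
    config.pop(key, None)
    return json.dumps(config, indent=2)


def effective_setting(saved_config: dict, key: str, **call_kwargs):
    """Resolve a setting: explicit call argument, then saved config, then upstream default."""
    if key in call_kwargs:
        return call_kwargs[key]
    if key in saved_config:
        return saved_config[key]
    return UPSTREAM_DEFAULTS[key]


# Config fragment before this commit (abbreviated to two keys).
before = {"do_rescale": True, "do_resize": False}
# Config fragment after this commit: the explicit entry is gone.
after = json.loads(drop_key(json.dumps(before)))

# A runtime that only reads saved defaults now sees the upstream behavior.
print(effective_setting(after, "do_resize"))
# A caller that passes do_resize=False explicitly gets the same result as before.
print(effective_setting(after, "do_resize", do_resize=False))
```

Under these assumptions, only callers that relied on the saved `false` value without passing the argument themselves would observe a change in behavior.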
preprocessor_config.json
CHANGED
@@ -9,7 +9,6 @@
   "do_normalize": true,
   "do_pad": null,
   "do_rescale": true,
-  "do_resize": false,
   "image_mean": [
     0.5,
     0.5,
video_preprocessor_config.json
CHANGED
@@ -7,7 +7,6 @@
   "do_convert_rgb": true,
   "do_normalize": true,
   "do_rescale": true,
-  "do_resize": false,
   "do_sample_frames": true,
   "fps": 2,
   "image_mean": [