Video-Text-to-Text
Transformers
Safetensors
qwen2_5_omni
text-to-audio
multimodal
video-captioning
audio-visual
ugc
Instructions to use openinterx/UGC-VideoCaptioner with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openinterx/UGC-VideoCaptioner with Transformers:
    # Load model directly
    from transformers import AutoProcessor, AutoModelForTextToWaveform

    processor = AutoProcessor.from_pretrained("openinterx/UGC-VideoCaptioner")
    model = AutoModelForTextToWaveform.from_pretrained("openinterx/UGC-VideoCaptioner")
- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md CHANGED
@@ -65,7 +65,7 @@ Real-world user-generated videos, especially on platforms like TikTok, often fea
 ## Quick Start
 
 You can use this model with the `transformers` library. Below is a quick example demonstrating how to perform inference.
-Please note that for full video processing capabilities, you might need to install `decord` and refer to the [official GitHub repository](https://github.com/
+Please note that for full video processing capabilities, you might need to install `decord` and refer to the [official GitHub repository](https://github.com/WPR001/UGC_VideoCaptioner/tree/main?tab=readme-ov-file) for detailed video handling steps, especially if `AutoProcessor` doesn't directly handle video file paths for complex scenarios.
 
 ### Environment Setup
 
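The changed README line suggests installing `decord` for cases where `AutoProcessor` does not take a video file path directly. As a rough illustration of what that manual step might look like, the sketch below samples evenly spaced frames from a video. It is not taken from the linked repository: the helper names `uniform_frame_indices` and `load_frames` and the default of 8 frames are assumptions made for this example, and the frame format the model actually expects should be checked against the official instructions.

```python
from typing import List


def uniform_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick up to num_samples frame indices spread evenly over the video.

    Hypothetical helper for illustration; not part of the model's repository.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, total_frames)
    # Split the video into equal-width segments and take each segment's midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def load_frames(video_path: str, num_samples: int = 8):
    """Decode evenly spaced frames as a uint8 array of shape (N, H, W, 3).

    Hypothetical helper; the default of 8 frames is an arbitrary assumption.
    """
    # Imported lazily so the index helper above works without decord installed.
    from decord import VideoReader, cpu

    reader = VideoReader(video_path, ctx=cpu(0))
    indices = uniform_frame_indices(len(reader), num_samples)
    return reader.get_batch(indices).asnumpy()
```

Calling `load_frames("clip.mp4")` (with a placeholder path) would return the sampled frames as a NumPy array. Taking segment midpoints rather than segment starts avoids always selecting the very first frame, which is often a black or title frame in user-generated clips.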