Transformers How to use longvideotool/LongVT-RL with Transformers:
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText
processor = AutoProcessor.from_pretrained("longvideotool/LongVT-RL")
model = AutoModelForImageTextToText.from_pretrained("longvideotool/LongVT-RL")