How to use longvideotool/LongVT-RFT with Transformers:
# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("longvideotool/LongVT-RFT") model = AutoModelForImageTextToText.from_pretrained("longvideotool/LongVT-RFT")