TIGER-Lab/VISTA-LongVA

---
license: mit
pipeline_tag: video-text-to-text
---

This repository contains the model described in [VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation](https://huggingface.co/papers/2412.00927).