Multimodal LLMs
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding