nkkbr/ViCA2-init

Video-Text-to-Text

text-generation

vision-language

video understanding

visuospatial cognition

spatial reasoning

Usage and Full Documentation

For detailed model description, training setup, datasets, evaluation results, and inference code, please refer to the following links:

Safetensors

Model size: 8B params

Tensor type: F32
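As a rough illustration of what the listed model size and tensor type imply for storage, the sketch below estimates the size of the raw weights. It assumes the rounded figure of 8 billion parameters (the exact count is not stated here) and 4 bytes per F32 element; the helper name `approx_weight_gib` is hypothetical, and the estimate excludes activations and any inference-time caches.

```python
# Bytes per element for a few common tensor dtypes.
BYTES_PER_ELEMENT = {"F32": 4, "F16": 2, "BF16": 2}


def approx_weight_gib(num_params: float, tensor_type: str) -> float:
    """Approximate size of the raw weights in GiB (weights only)."""
    return num_params * BYTES_PER_ELEMENT[tensor_type] / 2**30


# Assumption: "8B params" is treated as exactly 8e9 for this estimate.
size_f32 = approx_weight_gib(8e9, "F32")
# Hypothetical comparison: the same weights cast to a 2-byte dtype.
size_bf16 = approx_weight_gib(8e9, "BF16")

print(f"F32: {size_f32:.1f} GiB, BF16: {size_bf16:.1f} GiB")
```

Under these assumptions the F32 checkpoint comes to roughly 30 GiB, and casting to a 2-byte type at load time would halve that.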

Collection including nkkbr/ViCA2-init

ViCA2

5 items • Updated May 14, 2025