nkkbr/ViCA2-stage1-align
Video-Text-to-Text
Transformers
Safetensors
sam2
liuhaotian/LLaVA-CC3M-Pretrain-595K
English
vica_qwen
text-generation
multimodal
vision-language
video understanding
visuospatial cognition
spatial reasoning
vlm
llava
qwen
siglip
hiera
dual-encoder
License: apache-2.0
main
ViCA2-stage1-align
Commit History
Create README.md
653b4ae
verified
nkkbr
committed on
May 15, 2025
Initial commit
b1ee778
nkkbr
committed on
Apr 21, 2025
initial commit
8a675c1
verified
nkkbr
committed on
Apr 21, 2025