nvidia/omnivinci
Feature Extraction
Transformers
Safetensors
vila
omni-modal
multimodal
vision
audio
video
llm
custom_code
Eval Results (legacy)
arxiv: 2510.15870
License: apache-2.0
main
omnivinci/qwen_audio_encoder.py
Commit History
commit c48c32c
Hanrong Ye committed about 1 month ago