Collections

Discover the best community collections!

Collections trending this week
GIT
GIT (Generative Image-to-text Transformer) is a model useful for vision-language tasks such as image/video captioning and question answering.
ARCH models, benchmark and paper
This collection contains pre-trained models on the AudioSet dataset, offering a diverse set of features for audio representation learning.
Top LLM
A collection of top open-source LLMs, sorted with the best on top.
LLaVA-Video
Models focused on video understanding (previously known as LLaVA-NeXT-Video).