nkaushik's Collections
Backlog to try
Updated 24 days ago
fashn-ai/fashn-vton-1.5 • Image-to-Image • Updated Feb 1 • 91
unsloth/Z-Image-GGUF • Text-to-Image • 6B • Updated Jan 28 • 14.8k • 136
Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice • Text-to-Speech • 2B • Updated Jan 29 • 991k • 1.32k
Qwen/Qwen3-TTS-12Hz-1.7B-Base • Updated Jan 23 • 1.86M • 351
unsloth/FLUX.2-klein-9B-GGUF • Image-to-Image • 9B • Updated Jan 16 • 60.2k • 147
unsloth/FLUX.2-klein-base-4B-GGUF • Image-to-Image • 4B • Updated Jan 15 • 2.9k • 15
unsloth/FLUX.2-klein-base-9B-GGUF • Image-to-Image • 9B • Updated Jan 15 • 7.26k • 29
DevParker/VibeVoice7b-low-vram • Text-to-Speech • Updated Oct 23, 2025 • 62
ACE-Step/Ace-Step1.5 • Text-to-Audio • Updated Feb 3 • 34.7k • 659
circlestone-labs/Anima • Updated 14 days ago • 255k • 934
PaddlePaddle/PaddleOCR-VL-1.5 • Image-Text-to-Text • 1.0B • Updated 9 days ago • 182k • 501
sarvamai/sarvam-1 • Text Generation • 3B • Updated Nov 8, 2024 • 8.28k • 133
lightonai/LightOnOCR-2-1B • Image-Text-to-Text • 1B • Updated Feb 20 • 528k • 610
PaddlePaddle/PaddleOCR-VL • Image-Text-to-Text • 1.0B • Updated about 2 hours ago • 8.34k • 1.58k
LocoreMind/LocoOperator-4B • Text Generation • 4B • Updated 30 days ago • 17.5k • 206
unsloth/Qwen3.5-9B-GGUF • Image-Text-to-Text • 9B • Updated 24 days ago • 1.34M • 411