melsiddieg's Collections
DiffusionLLMs
Arudi
Biomedical
from_scratch_pretrain
bert and friends
Audiovisual
Research and Optimization
Visual and OCR
finetune_datasets
Audiovisual
Updated Apr 28
microsoft/VibeVoice-1.5B • Text-to-Speech • 3B • Updated Jan 22 • 237k • 2.41k
ibm-granite/granite-docling-258M • Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 101k • 1.2k
deepseek-ai/DeepSeek-OCR • Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 2.2M • 3.29k
Qwen/Qwen3-VL-2B-Thinking • Image-Text-to-Text • 2B • Updated Oct 20, 2025 • 62.9k • 115
datalab-to/chandra • Image-Text-to-Text • 9B • Updated Mar 26 • 160k • 527
Qwen/Qwen3-VL-2B-Instruct • Image-Text-to-Text • 2B • Updated Oct 23, 2025 • 2.12M • 433
PokeeAI/pokee_research_7b • Text Generation • 8B • Updated Oct 23, 2025 • 25 • 100
openbmb/MiniCPM-o-4_5 • Any-to-Any • 9B • Updated May 19 • 374k • 1.41k
Qwen/Qwen3-ForcedAligner-0.6B • Automatic Speech Recognition • 0.9B • Updated Jan 30 • 465k • 145
seemorg/books-ocr • Viewer • Updated May 2, 2025 • 49.1k • 33 • 5