Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 84
VideoPrism Collection VideoPrism is a foundational video encoder that enables state-of-the-art performance on a large variety of video understanding tasks. • 5 items • Updated Mar 12 • 19
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 9 items • Updated Mar 12 • 488
view article Article Supercharge your OCR Pipelines with Open Models +5 merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 309
view article Article Public AI on Hugging Face Inference Providers 🔥 +4 Jolow, thelastjosh, celinah, julien-c, sbrandeis, Wauplin • Sep 17, 2025 • 24
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 244
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Mar 12 • 217
view article Article Welcome GPT OSS, the new open-source model family from OpenAI! +10 reach-vb, pcuenq, lewtun, clem, Rocketknight1, clefourrier, celinah, Wauplin, marcsun13, pagezyhf, ahadnagy, joaogante • Aug 5, 2025 • 513
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare +1 aaditya, pminervini, clefourrier • Apr 19, 2024 • 196
view article Article Evaluating Audio Reasoning with Big Bench Audio mhillsmith, georgewritescode • Dec 20, 2024 • 29
view article Article Featherless AI on Hugging Face Inference Providers 🔥 +4 wxgeorge, pohnean-recursal, picocreator, celinah, Wauplin, sbrandeis • Jun 12, 2025 • 49
view article Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 3 days ago • 62
view article Article State of open video generation models in Diffusers +1 sayakpaul, a-r-r-o-w, dn6 • Jan 27, 2025 • 70
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 495
view article Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 211
view article Article Welcome to Inference Providers on the Hub 🔥 +5 burkaygur, zeke, aton2006, hassanelmghari, sbrandeis, kramp, julien-c • Jan 28, 2025 • 495
view article Article Deploying Speech-to-Speech on Hugging Face +2 andito, derek-thomas, dmaniloff, eustlb • Oct 22, 2024 • 45