TIPSv2 Collection TIPSv2 foundational vision-language models. Webpage: https://gdm-tipsv2.github.io/ • 9 items • Updated 29 days ago • 30
TADA Collection TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment | https://huggingface.co/papers/2602.23068 • 7 items • Updated Mar 24 • 71
view article Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 159
view article Article Learn the Hugging Face Kernel Hub in 5 Minutes +5 drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164
view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published Jan 29 • 103
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 167
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 310
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 245
view article Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix codelion • Nov 3, 2025 • 65
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 187
SVDQunat: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Paper • 2411.05007 • Published Nov 7, 2024 • 24
view article Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels drbh, danieldk • Aug 18, 2025 • 97