Article Kog Laneformer 2B: The Latency-First Model Behind Kog Inference Engine kogai • 2 days ago • 27
Article Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains JetBrains • 25 days ago • 32
Article SmolLM3: smol, multilingual, long-context reasoner eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780
TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment Paper • 2602.23068 • Published Feb 26 • 8
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published Jan 31 • 325
SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization Paper • 2602.02383 • Published Feb 2 • 30
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published Jan 30 • 63
Simple Projection Variants Improve ColBERT Performance Paper • 2510.12327 • Published Oct 14, 2025 • 7
Article TFLOPS Gap: Why FP4 MoE Kernel Engineering Matters on Blackwell apsys • Jan 5 • 14
Mellum: Production-Grade in-IDE Contextual Code Completion with Multi-File Project Understanding Paper • 2510.05788 • Published Oct 7, 2025 • 4
T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground Paper • 2512.10430 • Published Dec 11, 2025 • 120
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19, 2025 • 234
TokensGen: Harnessing Condensed Tokens for Long Video Generation Paper • 2507.15728 • Published Jul 21, 2025 • 8
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining Paper • 2507.14119 • Published Jul 18, 2025 • 60
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8, 2025 • 121
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement Paper • 2506.07634 • Published Jun 9, 2025 • 6