view article Article We Got Claude to Build CUDA Kernels and teach open models! +2 burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
view article Article 🪆 Introduction to Matryoshka Embedding Models +1 tomaarsen, Xenova, osanseviero • Feb 23, 2024 • 208
view article Article PP-OCRv5 on Hugging Face: A Specialized Approach to OCR baidu • Sep 10, 2025 • 111
view article Article Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs +7 wenhuach, Haihao, weiweiz1, n1ck-guo, isaacmac, kding1, IlyasMoutawwakil, marcsun13, medmekk • Apr 29, 2025 • 44
Executable Code Actions Elicit Better LLM Agents Paper • 2402.01030 • Published Feb 1, 2024 • 194
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm Paper • 2502.02358 • Published Feb 4, 2025 • 19
ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization Paper • 2502.04306 • Published Feb 6, 2025 • 20
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community +1 Leyo, HugoLaurencon, VictorSanh • Apr 15, 2024 • 191
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 98
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published Jan 10, 2025 • 75
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published Jan 13, 2025 • 55
Potential and Perils of Large Language Models as Judges of Unstructured Textual Data Paper • 2501.08167 • Published Jan 14, 2025 • 6
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025 • 303
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published Jan 15, 2025 • 30
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published Dec 24, 2024 • 74