NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 23 items • Updated 18 days ago • 330
PGC Psychiatric GWAS Summary Statistics Collection ~1 billion rows of genome-wide association study (GWAS) NOTE: We are in the process to transfer these datasets to the Psychiatric Genomics Consortiu • 12 items • Updated 8 days ago • 92
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 129
MediPhi Collection A collection of SLMs based on Phi3.5-mini-instruct adapted to clinical natural language processing tasks: https://arxiv.org/abs/2505.10717 • 10 items • Updated Oct 1, 2025 • 29
view article Article NVIDIA Releases Improved Pretraining Dataset: Preserves High Value Math & Code, and Augments with Multi-Lingual nvidia • Aug 18, 2025 • 4
view article Article 📢 NVIDIA Releases Nemotron-CC-Math Pre-Training Dataset: A High-Quality, Web-Scale Math Corpus for Pretraining Large Language Models nvidia • Aug 18, 2025 • 5
view article Article Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B nvidia • Aug 18, 2025 • 32
view article Article NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset nvidia • Aug 20, 2025 • 19
view article Article MCP for Research: How to Connect AI to Research Tools dylanebert • Aug 18, 2025 • 70
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5, 2025 • 141
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 185
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience Paper • 2508.04700 • Published Aug 6, 2025 • 52
view article Article Learn the Hugging Face Kernel Hub in 5 Minutes +5 drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164