Article • PhysicsIntern: from an Autonomous Benchmark-runner to a Research Sidekick • dlouapre • 17 days ago • 7
Article • Designing the hf CLI as an agent-optimized way to work with the Hub • celinah, Wauplin • 25 days ago • 58
Article • Harness, Scaffold, and the AI Agent Terms Worth Getting Right • sergiopaniego, ariG23498 • May 25 • 124
🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated Mar 2 • 13
Article • Compute and Competition in AI: Different FlOPs for Different Folks • sasha • Feb 12 • 16
Article • Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face • abidlabs, znation, nouamanetazi, sasha, qgallouedec • Jul 29, 2025 • 225
Common Pile v0.1 Collection All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 41