OpenThinker-Agent2 Collection OpenThinker-Agent2: agentic SFT/RL datasets and 8B/32B models (cold-start SFT, RL, and the OpenThinkerAgent-32B release). • 11 items • Updated 19 days ago • 8
Article Building Moon Bot: A Slack-Native Coding Agent Backed by HuggingFace Buckets huggingface • 6 days ago • 42
Article I fine-tuned a model for free from one prompt, with TRL and the Google Colab CLI sergiopaniego • 15 days ago • 4
Article Harness, Scaffold, and the AI Agent Terms Worth Getting Right sergiopaniego, ariG23498 • May 25 • 124
Article The Open Source Community is backing OpenEnv for Agentic RL burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego, banghua • 23 days ago • 101
Article Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • May 29 • 130
Repo2RLEnv — Verifiable RL Environments Collection Verifiable RL environments built from real GitHub repos. One dataset per pipeline. Source: https://github.com/huggingface/Repo2RLEnv • 5 items • Updated 12 days ago • 1
Article Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, lvwerra, sergiopaniego • May 27 • 42
RFDetr Collection RF-DETR checkpoints converted to be used with 🤗 Transformers • 15 items • Updated May 27 • 17
🧬 Carbon Collection Carbon 500M, 3B, 8B genomic models and GGUF variants for llama.cpp • 7 items • Updated 28 days ago • 43
Article KV Caching Explained: Optimizing Transformer Inference Efficiency not-lain • Jan 30, 2025 • 356
Article Unlocking asynchronicity in continuous batching ror, pcuenq, ariG23498 • May 14 • 61