DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 449
Qwen3-ASR Demo 🎙 • 136 • Transcribe audio to text with multi-language timestamps
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 231
Qwen3-TTS Demo 🎙 • 1.92k • Generate speech audio from text with custom or cloned voices
The Smol Training Playbook 📚 • 3.16k • The secrets to building world-class LLMs
Dino Tooth Identifier 🦀 • 1 • Identify the dinosaur genus from a photo of a fossil tooth.
h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt Text Generation • Updated May 24, 2023 • 19 • 4