Active filters: rl
Text Generation
• 4B • Updated • 1
Text Generation
• 4B • Updated
Text Generation
• 4B • Updated • 2
Text Generation
• 4B • Updated • 1
Text Generation
• 4B • Updated
Text Generation
• 4B • Updated • 3
Text Generation
• 4B • Updated • 2
HarleyCooper/Qwen3-30B-Dakota1890
Text Generation
• Updated • 2
• 2
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_step_120
Text Generation
• 4B • Updated • 2
HarleyCooper/Qwen3-30B-ThinkingMachines-Dakota1890
Reinforcement Learning
• Updated • 3
Text Generation
• 21B • Updated • 4
mradermacher/CAI-20B-v2-GGUF
Text Generation
• 21B • Updated • 23
mradermacher/CAI-20B-v2-i1-GGUF
Text Generation
• 21B • Updated • 56
socaitcy/SOCAIT-Hermes-14B
Text Generation
• Updated
ash256/qwen3-4b-question-gen
Text Generation
• 4B • Updated • 3
• 1
pankajmathur/nanochat-d34-rl-all-ckpts
Text Generation
• Updated • 1
pankajmathur/nanochat-d34-rl
Text Generation
• Updated
pankajmathur/RenCoder-Devstral-Small-2507
Text Generation
• 24B • Updated • 21
• 1
HallD/SkeptiSTEM-4B-v2-stageR3-grpo-lora
Text Generation
• Updated • 2
anakin87/LFM2-2.6B-ttt-rl
Text Generation
• Updated • 3
anakin87/LFM2-2.6B-ttt-rl-merged
Text Generation
• 3B • Updated • 2
Any-to-Any
• 7B • Updated • 11
ModalityDance/Omni-R1-Zero
Any-to-Any
• 7B • Updated • 12
ibrahima2222/nanochat-d32
• Updated
IIGroup/X-Coder-RL-Qwen2.5-7B
8B • Updated • 53
• 1
IIGroup/X-Coder-RL-Qwen3-8B
8B • Updated • 4
• 1
mradermacher/X-Coder-RL-Qwen3-8B-GGUF
8B • Updated • 244
• 1
mradermacher/X-Coder-RL-Qwen2.5-7B-GGUF
8B • Updated • 297
mradermacher/X-Coder-RL-Qwen3-8B-i1-GGUF
8B • Updated • 438
• 2
mradermacher/X-Coder-RL-Qwen2.5-7B-i1-GGUF
8B • Updated • 396