Active filters: rl
Text Generation • 4B • Updated • 1
Text Generation • 4B • Updated • 1
Text Generation • 4B • Updated • 1
Text Generation • 4B • Updated • 1
Text Generation • 4B • Updated • 2
Text Generation • 4B • Updated • 1
HarleyCooper/Qwen3-30B-Dakota1890 • Text Generation • Updated • 1 • 2
HarleyCooper/Qwen3-30B-ThinkingMachines-Dakota1890 • Reinforcement Learning • Updated
Text Generation • 21B • Updated • 22
mradermacher/CAI-20B-v2-GGUF • Text Generation • 21B • Updated • 44
mradermacher/CAI-20B-v2-i1-GGUF • Text Generation • 21B • Updated • 150
socaitcy/SOCAIT-Hermes-14B • Text Generation • Updated
ash256/qwen3-4b-question-gen • Text Generation • 4B • Updated • 4 • 1
pankajmathur/nanochat-d34-rl-all-ckpts • Text Generation • Updated • 1
pankajmathur/nanochat-d34-rl • Text Generation • Updated
HallD/SkeptiSTEM-4B-v2-stageR3-grpo-lora • Text Generation • Updated
Any-to-Any • 7B • Updated • 1.87k
ModalityDance/Omni-R1-Zero • Any-to-Any • 7B • Updated • 35
ibrahima2222/nanochat-d32 • Updated
IIGroup/X-Coder-RL-Qwen2.5-7B • 8B • Updated • 101 • 1
IIGroup/X-Coder-RL-Qwen3-8B • 8B • Updated • 97 • 1
mradermacher/X-Coder-RL-Qwen3-8B-GGUF • 8B • Updated • 539
mradermacher/X-Coder-RL-Qwen2.5-7B-GGUF • 8B • Updated • 305
mradermacher/X-Coder-RL-Qwen3-8B-i1-GGUF • 8B • Updated • 1.38k
mradermacher/X-Coder-RL-Qwen2.5-7B-i1-GGUF • 8B • Updated • 750
mradermacher/Omni-R1-Zero-GGUF • 7B • Updated • 531
mradermacher/Omni-R1-GGUF • 7B • Updated • 683
mradermacher/Omni-R1-Zero-i1-GGUF • 7B • Updated • 1.41k
mradermacher/Omni-R1-i1-GGUF • 7B • Updated • 2k