Active filters: RL
15B
•
Updated
•
5
8B
•
Updated
mradermacher/DiagAgent-7B-GGUF
8B
•
Updated
•
154
mradermacher/Austral-70B-Winton-GGUF
71B
•
Updated
•
69
mradermacher/Austral-70B-Winton-i1-GGUF
71B
•
Updated
•
223
HYDARIM7/SmolLM2_RLHF_PPO_HY
Reinforcement Learning
•
0.1B
•
Updated
SII-Enigma/Qwen2.5-7B-Ins-AMPO
Text Generation
•
2B
•
Updated
•
2
SII-Enigma/Qwen2.5-7B-Ins-SFT-GRPO
Text Generation
•
2B
•
Updated
•
3
SII-Enigma/Llama3.2-8B-Ins-GRPO
Text Generation
•
2B
•
Updated
•
1
•
1
mradermacher/Llama3.2-8B-Ins-GRPO-GGUF
8B
•
Updated
•
92
•
1
SII-Enigma/Qwen2.5-7B-Ins-SFT-AMPO
Text Generation
•
8B
•
Updated
•
1
SII-Enigma/Qwen2.5-7B-Ins-GRPO
Text Generation
•
2B
•
Updated
SII-Enigma/Qwen2.5-1.5B-Ins-AMPO
Text Generation
•
2B
•
Updated
•
3
SII-Enigma/Llama3.2-8B-Ins-AMPO
Text Generation
•
8B
•
Updated
•
2
SII-Enigma/Qwen2.5-1.5B-Ins-GRPO
Text Generation
•
2B
•
Updated
Text Generation
•
2B
•
Updated
•
6
mradermacher/GCPO-R1-1.5B-GGUF
2B
•
Updated
•
31
mradermacher/GCPO-R1-1.5B-i1-GGUF
2B
•
Updated
•
88
mradermacher/DeepHermes-Egregore-8B-131K-GGUF
Reinforcement Learning
•
8B
•
Updated
•
62
•
1
mradermacher/DeepHermes-Egregore-8B-131K-i1-GGUF
Reinforcement Learning
•
8B
•
Updated
•
5.57k
•
1
stephenchungmh/thinker_r1_5b
2B
•
Updated
•
1
•
1
stephenchungmh/thinker_q1_5b
2B
•
Updated
•
1
stephenchungmh/thinker_r7b
8B
•
Updated
•
1
•
1
8B
•
Updated
•
1
•
1
mradermacher/RENT-Qwen-7B-GGUF
8B
•
Updated
•
80
•
1
mradermacher/RENT-Qwen-7B-i1-GGUF
8B
•
Updated
•
86
•
1
beyoru/MinCoder-4B-Expert
Text Generation
•
4B
•
Updated
•
3
•
1
mradermacher/MinCoder-4B-Expert-GGUF
4B
•
Updated
•
93
•
2
mradermacher/MinCoder-4B-Expert-i1-GGUF
4B
•
Updated
•
76
•
1
Text Generation
•
4B
•
Updated
•
1