Models (480)

Active filters: rl

caiyuchen/Spiral-step-16

Text Generation • 4B • Updated Nov 15, 2025 • 3

caiyuchen/Spiral-step-18

Text Generation • 4B • Updated Nov 15, 2025 • 3

caiyuchen/Spiral-step-17

Text Generation • 4B • Updated Nov 15, 2025 • 2

caiyuchen/Spiral-step-20

Text Generation • 4B • Updated Nov 15, 2025 • 5

caiyuchen/Spiral-step-19

Text Generation • 4B • Updated Nov 15, 2025 • 3

caiyuchen/Spiral-step-22

Text Generation • 4B • Updated Nov 15, 2025 • 2

caiyuchen/Spiral-step-21

Text Generation • 4B • Updated Nov 15, 2025 • 3

HarleyCooper/Qwen3-30B-Dakota1890

Text Generation • Updated Nov 23, 2025 • 7 • 2

HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_step_120

Text Generation • 4B • Updated Feb 13 • 15

HarleyCooper/Qwen3-30B-ThinkingMachines-Dakota1890

Reinforcement Learning • Updated Nov 23, 2025 • 4

mradermacher/CAI-20B-v2-GGUF

Text Generation • 21B • Updated Dec 1, 2025 • 93

mradermacher/CAI-20B-v2-i1-GGUF

Text Generation • 21B • Updated Dec 4, 2025 • 134

socaitcy/SOCAIT-Hermes-14B

Text Generation • Updated Dec 4, 2025 • 10

ash256/qwen3-4b-question-gen

Text Generation • 4B • Updated Dec 7, 2025 • 22 • 1

pankajmathur/nanochat-d34-rl-all-ckpts

Text Generation • Updated Dec 9, 2025 • 1

pankajmathur/nanochat-d34-rl

Text Generation • Updated Dec 9, 2025

pankajmathur/RenCoder-Devstral-Small-2507

Text Generation • 24B • Updated Apr 10 • 18 • 1

HallD/SkeptiSTEM-4B-v2-stageR3-grpo-lora

Text Generation • Updated Jan 4 • 1

anakin87/LFM2-2.6B-ttt-rl

Text Generation • Updated Apr 5 • 3

anakin87/LFM2-2.6B-ttt-rl-merged

Text Generation • 3B • Updated Apr 5 • 4

ModalityDance/Omni-R1

Any-to-Any • 7B • Updated Jan 21 • 4

ModalityDance/Omni-R1-Zero

Any-to-Any • 7B • Updated Jan 21 • 1

ibrahima2222/nanochat-d32

IIGroup/X-Coder-RL-Qwen2.5-7B

8B • Updated Jan 13 • 10 • 1

IIGroup/X-Coder-RL-Qwen3-8B

8B • Updated Jan 13 • 15 • 1

mradermacher/X-Coder-RL-Qwen3-8B-GGUF

8B • Updated Jan 11 • 119 • 1

mradermacher/X-Coder-RL-Qwen2.5-7B-GGUF

8B • Updated Jan 11 • 48

mradermacher/X-Coder-RL-Qwen3-8B-i1-GGUF

8B • Updated Jan 11 • 583 • 2

mradermacher/X-Coder-RL-Qwen2.5-7B-i1-GGUF

8B • Updated Jan 11 • 250

anakin87/LFM2-2.6B-ttt-rl-2

Text Generation • Updated Apr 5 • 2