Models: 74,730

Active filters: reinforcement-learning

Chunjiang-Intelligence/DeepSeek-v4-Fable

Text Generation • 149B • Updated 5 days ago • 1.33k • 117

di-zhang-fdu/openfugu-conductor-3b

Text Generation • 3B • Updated 6 days ago • 34 • 7

Shadow0482/mythos_fast

Reinforcement Learning • 3B • Updated 10 days ago • 1.24k • 6

OpenMOSS-Team/MOSS-Transcribe-preview-2B

Automatic Speech Recognition • 2B • Updated 2 days ago • 5 • 6

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 171 • 159

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 545

InternScience/Agents-K1

Text Generation • 4B • Updated 16 days ago • 643 • 20

PhysicsWallahAI/Aryabhata-2.0

Text Generation • 21B • Updated 25 days ago • 274 • 5

MooreThreads/MusaCoder-27B

Reinforcement Learning • 3.05M • Updated 18 days ago • 881 • 45

Tesleum/shirdel-coder-9b-claude-fable-5

Reinforcement Learning • 9B • Updated about 10 hours ago • 1.17k • 2

mradermacher/Tifa-Deepsex-14b-CoT-GGUF

Reinforcement Learning • 15B • Updated Jul 31, 2025 • 321 • 24

TianheWu/VisualQuality-R1-7B

Reinforcement Learning • 8B • Updated Sep 19, 2025 • 4.85k • 14

zai-org/GLM-TTS

Text-to-Speech • Updated Jan 12 • 246 • 342

MYTH-Lab/GoT-R1-4B

Text Generation • 4B • Updated about 1 month ago • 6 • 2

MYTH-Lab/GoT-R1-14B

Text Generation • 15B • Updated about 1 month ago • 6 • 1

mradermacher/GoT-R1-4B-GGUF

Text Generation • 4B • Updated 2 days ago • 211 • 2

exla-ai/openpie-0.6

Robotics • Updated Feb 4 • 79 • 26

SolarSys2026/EnergyTrading

Reinforcement Learning • Updated Feb 8 • 1

XunmeiLiu/VFIG-4B

Reinforcement Learning • 4B • Updated Mar 27 • 105 • 7

Nalandadata/nalanda-qwen-7b-grpo

Text Generation • 8B • Updated 1 day ago • 95 • 2

Falconss1/VideoThinker-R1-3B

Video-Text-to-Text • 4B • Updated May 5 • 8 • 2

Mercury7353/MetaAgent-X

Reinforcement Learning • 8B • Updated May 15 • 86 • 6

zghhui/OmniNFT

Any-to-Any • Updated May 19 • 41

6kplus/PhyMotion-CausalForcing-1.3B

Text-to-Video • Updated May 16 • 5

11-47/GODs.Ghost.Codex.VII

Text Generation • 1B • Updated 7 days ago • 1

AnvaMiba/qwen3-8b-bargaining-lora

Text Generation • Updated 27 days ago • 1

poolside-laguna-hackathon/protein-ligand-design

Text Generation • Updated 29 days ago • 44 • 1

mims-harvard/ATHENA-R1-Qwen3-8B

Text Generation • 8B • Updated 12 days ago • 38 • 2

11-47/Sentience.Cascade.II

Text Generation • 1B • Updated 7 days ago • 15 • 1

sanju-1007/SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated 4 days ago • 65 • 1