Active filters:
ppo
mrinaldi86/ppo-LunarLander-v3
Reinforcement Learning
•
Updated
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_offline_nav_2nd
Reinforcement Learning
•
5B
•
Updated
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_ppo_3rd
Reinforcement Learning
•
Updated
nasnoussi/ppo-Pixelcopter-v1
Reinforcement Learning
•
Updated
dragovoid/ppo-LunarLander-v2-u8
Reinforcement Learning
•
Updated
amostof/ppoScratchTest-LunarLander-v2
Reinforcement Learning
•
Updated
fangyima/cleanrl-ppo-LunarLander-v2
Reinforcement Learning
•
Updated
faelwen/ppo-LunarLander-v2-scratch
Reinforcement Learning
•
Updated
Khushal31/ppo-Unit8-LunarLander-v2
Reinforcement Learning
•
Updated
suneater175/CleanRL-LunarLander-v2
Reinforcement Learning
•
Updated
zhangtemplar/LunarLander-v2-newppo
Reinforcement Learning
•
Updated
pdimas/helpfulpharmacyllm_js-rlhf-01
Reinforcement Learning
•
1B
•
Updated
•
4
pdimas/helpfulpharmacyllm_mb-rlhf-01
Reinforcement Learning
•
1B
•
Updated
•
4
udonhef2bmad/U8P1-ppo-LunarLander-v2
Reinforcement Learning
•
Updated
jonathansculley/ppo-LunarLander-v3
Reinforcement Learning
•
Updated
tmoroder/manual-ppo-LunarLander-v2
Reinforcement Learning
•
Updated
nossie0360/clean-ppo-LunarLander-v2
Reinforcement Learning
•
Updated
AntonVoronko/ppo-fs-LunarLander-v2
Reinforcement Learning
•
Updated
ALEXIOSTER/ppo-CartPole-v1
Reinforcement Learning
•
Updated
ALEXIOSTER/ppo-LunarLander-v2
Reinforcement Learning
•
Updated