Active filters: ppo
Nack34/ppo-from-scratch-LunarLander-v2
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
Ari8/ppo-LunarLander-v2_unit8
Reinforcement Learning
• Updated
AndreiVoicuT/ppo-LunarLander-v2-C8
Reinforcement Learning
• Updated • 1
alejandroajhr/ppo-LunarLander-v2-unit8
Reinforcement Learning
• Updated
ajagota71/pythia-70m-detox-irl-rlhf-test
Reinforcement Learning
• 70.4M • Updated • 1
rusuanjun/ppo-selfimplement-LunarLander-v2
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
aalva/ppo-cleanrl-LunarLander-v2
Reinforcement Learning
• Updated
ajagota71/pythia-70m-detox-irl-rlhf-test-facebook-filter
Reinforcement Learning
• 70.4M • Updated
ajagota71/pythia-70m-detox-irl-rlhf-test2
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-detox-raw-logits-test2
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-160m-detox-raw-logits-test2
Reinforcement Learning
• 0.2B • Updated
ajagota71/pythia-70m-detox-irl-rlhf-seed-42
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-detox-irl-rlhf-seed-200
Reinforcement Learning
• 70.4M • Updated
DeepMostInnovations/sales-conversion-model-reinf-learning
Reinforcement Learning
• Updated • 6 • 33
Reinforcement Learning
• Updated
ajagota71/pythia-410m-detox-irl-rlhf-seed-42
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-detox-irl-rlhf-seed-100
Reinforcement Learning
• 0.4B • Updated • 2
ajagota71/pythia-410m-detox-irl-rlhf-seed-200
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-detox-irl-rlhf-seed-300
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-detox-irl-rlhf-seed-400
Reinforcement Learning
• 0.4B • Updated
S-Chaves/ppo-from-scratch-LunarLander-v2
Reinforcement Learning
• Updated
Arrebol-yzq/RLP_llm_inductive_model
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-20
Reinforcement Learning
• 70.4M • Updated
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-40
Reinforcement Learning
• 70.4M • Updated
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-60
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-80
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-100
Reinforcement Learning
• 70.4M • Updated • 1