Active filters: ppo
westy412/ppo-LunarLander-v1-u8 • Reinforcement Learning • Updated
jlse/ppo-LunarLander-v2-u8 • Reinforcement Learning • Updated
ajagota71/pythia-70m-detox-test • Reinforcement Learning • 70.4M • Updated • 1
Momin-Shahzad/ppo-CartPole-v1 • Reinforcement Learning • Updated
ajagota71/pythia-70m-detox-raw-logits • Reinforcement Learning • 70.4M • Updated • 1
Momin-Shahzad/LunarLander-v2 • Reinforcement Learning • Updated
Nack34/ppo-from-scratch-LunarLander-v2 • Reinforcement Learning • Updated
Ari8/ppo-LunarLander-v2_unit8 • Reinforcement Learning • Updated
AndreiVoicuT/ppo-LunarLander-v2-C8 • Reinforcement Learning • Updated
alejandroajhr/ppo-LunarLander-v2-unit8 • Reinforcement Learning • Updated
ajagota71/pythia-70m-detox-irl-rlhf-test • Reinforcement Learning • 70.4M • Updated • 1
rusuanjun/ppo-selfimplement-LunarLander-v2 • Reinforcement Learning • Updated
aalva/ppo-cleanrl-LunarLander-v2 • Reinforcement Learning • Updated
ajagota71/pythia-70m-detox-irl-rlhf-test-facebook-filter • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-detox-irl-rlhf-test2 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-detox-raw-logits-test2 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-160m-detox-raw-logits-test2 • Reinforcement Learning • 0.2B • Updated
ajagota71/pythia-70m-detox-irl-rlhf-seed-42 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-detox-irl-rlhf-seed-200 • Reinforcement Learning • 70.4M • Updated • 1
DeepMostInnovations/sales-conversion-model-reinf-learning • Reinforcement Learning • Updated • 11 • 33
ajagota71/pythia-410m-detox-irl-rlhf-seed-42 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-100 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-200 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-300 • Reinforcement Learning • 0.4B • Updated • 1