Active filters: ppo
ajagota71/pythia-410m-detox-irl-rlhf-seed-400 • Reinforcement Learning • 0.4B • Updated • 1
S-Chaves/ppo-from-scratch-LunarLander-v2 • Reinforcement Learning • Updated
Arrebol-yzq/RLP_llm_inductive_model • Reinforcement Learning • Updated
Reinforcement Learning • Updated
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-20 • Reinforcement Learning • 70.4M • Updated
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-40 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-60 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-80 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-100 • Reinforcement Learning • 70.4M • Updated
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-120 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-140 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-160 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-180 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-200 • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-70m-fb-detox • Reinforcement Learning • 70.4M • Updated • 1
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-20 • Reinforcement Learning • 0.2B • Updated • 1
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-60 • Reinforcement Learning • 0.2B • Updated • 1
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-80 • Reinforcement Learning • 0.2B • Updated • 1
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-100 • Reinforcement Learning • 0.2B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-20 • Reinforcement Learning • 0.4B • Updated
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-40 • Reinforcement Learning • 0.4B • Updated
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-60 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-80 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-100 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-120 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-140 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-160 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-180 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-200 • Reinforcement Learning • 0.4B • Updated • 1
ajagota71/pythia-410m-fb-detox • Reinforcement Learning • 0.4B • Updated