Active filters: ppo
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
LichengLiu03/Qwen2.5-3B-UFO
Text Generation
• 3B • Updated • 3
• 2
rllapin28/ppo-CartPole-v1
Reinforcement Learning
• Updated
carolinacon/ppo-CartPole-v1
Reinforcement Learning
• Updated
LichengLiu03/Qwen2.5-3B-UFO-1turn
Text Generation
• 3B • Updated • 1
• 2
ajagota71/pythia-70m-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-s-nlp-detox-checkpoint-epoch-60
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-s-nlp-detox-checkpoint-epoch-80
Reinforcement Learning
• 70.4M • Updated • 1
ajagota71/pythia-70m-s-nlp-detox-checkpoint-epoch-100
Reinforcement Learning
• 70.4M • Updated
ajagota71/pythia-70m-s-nlp-detox
Reinforcement Learning
• 70.4M • Updated • 1
JulioSnchezD/ppo-CartPole-v1
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
mradermacher/Qwen2.5-3B-UFO-GGUF
3B • Updated • 28
• 1
mradermacher/Qwen2.5-3B-UFO-1turn-GGUF
3B • Updated • 36
• 1
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 0.4B • Updated • 1
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 0.4B • Updated
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-60
Reinforcement Learning
• 0.4B • Updated • 2
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-80
Reinforcement Learning
• 0.4B • Updated • 1
ajagota71/pythia-410m-s-nlp-detox-checkpoint-epoch-100
Reinforcement Learning
• 0.4B • Updated • 1
ajagota71/pythia-410m-s-nlp-detox
Reinforcement Learning
• 0.4B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-60
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-80
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox-checkpoint-epoch-100
Reinforcement Learning
• 1B • Updated • 1
ajagota71/pythia-1b-s-nlp-detox
Reinforcement Learning
• 1B • Updated • 2
ajagota71/llama-3-2-1b-s-nlp-detox-checkpoint-epoch-20
Reinforcement Learning
• 1B • Updated • 1
ajagota71/llama-3-2-1b-s-nlp-detox-checkpoint-epoch-40
Reinforcement Learning
• 1B • Updated • 1