Active filters: ppo

CatkinChen/nethack-ppo-ablation-no_hmm_rnd
Reinforcement Learning

CatkinChen/nethack-ppo-ablation-baseline_curiosity_dyn_only
Reinforcement Learning

joigalcar/ppo-LunarLander-v2_Scratch
Reinforcement Learning

joigalcar/ppo-LunarLander-v2_Scratch_2
Reinforcement Learning

rishiad/kinitro-metaworld-agent
Reinforcement Learning

CatkinChen/nethack-ppo-ablation-baseline_rnd
Reinforcement Learning

CatkinChen/nethack-ppo-ablation-baseline_curiosity_skill_only
Reinforcement Learning

CatkinChen/nethack-ppo-ablation-baseline_curiosity_trans_only
Reinforcement Learning

OxoGhost/ppo-LunarLander-v2-PPO
Reinforcement Learning

WillLedd/ppoCartPoleFromScratch
Reinforcement Learning

nabeelshan/rlhf-gpt2-pipeline
Text Generation

CatkinChen/nethack-ppo-ablation-baseline_full_curiosity
Reinforcement Learning

WillLedd/PPO-CleanRL-LunarLander-v2
Reinforcement Learning

tstenborg/unit8-LunarLander-v2
Reinforcement Learning

timflash/ppo-LunarLander-v2
Reinforcement Learning

forgedRice/ppo-LunarLander-v2
Reinforcement Learning

forgedRice/drl-course-unit-01-lunar-lander-v2
Reinforcement Learning

user05181824/ppo-LunarLander-v3
Reinforcement Learning

Text Generation
• 0.1B • 2 • 1

ricardo-teixeira9/ppo-LunarLander-v2_unit8
Reinforcement Learning

Bavantha11/LunarLander-v2-unit8
Reinforcement Learning

Vibudhbh/gpt2-rlhf-implementation
Text Generation
• 0.1B • 5

ginnigarg/ginni-ppo-LunarLander-v2
Reinforcement Learning

mradermacher/gpt2-rlhf-implementation-GGUF
• 0.1B • 120

chenyu0x00/ppo-unit8-LunarLander-v2
Reinforcement Learning

Sharath-25/ppo-from-scratch
Reinforcement Learning

granenko/ppo-LunarLander-v3
Reinforcement Learning

MrOceanMan/ppo-LunarLander-v2
Reinforcement Learning

Reinforcement Learning

Aubins/CustomPPO-LunarLander-v2
Reinforcement Learning