Active filters: ppo
AIventurer/ppo-CartPole-v1 • Reinforcement Learning • Updated
AriYusa/ppo-implementation • Reinforcement Learning • Updated
volfy/huggingface_rl_unit8_ppo-CartPole-v1 • Reinforcement Learning • Updated
volfy/huggingface_rl_unit8_ppo-LunarLander-v3 • Reinforcement Learning • Updated
MartinRedWhite/unit8-LunarLander-v2-unit8 • Reinforcement Learning • Updated
volfy/huggingface_rl_unit8_ppo-LunarLander-v2 • Reinforcement Learning • Updated
Vanheart/ppoCRL-LunarLander-v2 • Reinforcement Learning • Updated
JuanjoGT13/ppo-CartPole-v1 • Reinforcement Learning • Updated
amostof/ppoScratch-LunarLander-v2 • Reinforcement Learning • Updated
twofacejr/ppo-CartPole-v1 • Reinforcement Learning • Updated
vinhdq842/ppo-LunarLander-v2-scratch • Reinforcement Learning • Updated
Jennny/llama3_samsum_rl_marshal • Reinforcement Learning • 8B • Updated • 1
Jennny/llama3_dialogsum_rl_marshal • Reinforcement Learning • 8B • Updated • 1
francescosabbarese/ppo-CartPole-v1 • Reinforcement Learning • Updated
francescosabbarese/ppo-LunarLander-v2-unit8-pt1 • Reinforcement Learning • Updated
nasnoussi/ppo-CartPole-v1 • Reinforcement Learning • Updated
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_test • Reinforcement Learning • Updated • 1
baronase/ppo-cleanrl-CartPole-v1 • Reinforcement Learning • Updated
baronase/ppo-cleanrl-CartPole-v1_2 • Reinforcement Learning • Updated
baronase/ppo-cleanrl-LunarLander-v2_1 • Reinforcement Learning • Updated
baronase/ppo-cleanrl-LunarLander-v2_200k • Reinforcement Learning • Updated
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_ppo_2nd • Reinforcement Learning • Updated
takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_offline_nav • Reinforcement Learning • 5B • Updated • 9
Jennny/llama3_samsum_marl_wo_comm • Reinforcement Learning • 8B • Updated
Jennny/llama3_dialogsum_marl_wo_comm • Reinforcement Learning • 8B • Updated • 1
lucas-palmiro/ppo-LunarLander-v3 • Reinforcement Learning • Updated
lucas-palmiro/ppo-early-stopping-LunarLander-v3 • Reinforcement Learning • Updated
sighmon/ppo-cleanrl-LunarLander-v2 • Reinforcement Learning • Updated
mrinaldi86/ppo-CartPole-v1 • Reinforcement Learning • Updated