katharsis/carv1-ppo

katharsis committed on Jan 4

Commit 082acdf · verified · 1 parent: 5465f29

Update model card

Files changed (1)

README.md +3 -3

@@ -12,7 +12,7 @@ model-index:
       type: reinforcement-learning
     metrics:
     - type: mean_reward
-      value: 806.98 +/- 0.00
+      value: 807.10 +/- 0.00
       name: Mean Reward
 ---
@@ -24,8 +24,8 @@ This is a trained PPO agent for the CarV1 line-following environment.
 - **Algorithm**: PPO (Proximal Policy Optimization)
 - **Framework**: Stable-Baselines3
-- **Training Timesteps**: 49,920
-- **Mean Reward**: 806.98 ± 0.00
+- **Training Timesteps**: 99,840
+- **Mean Reward**: 807.10 ± 0.00
 - **Training Date**: 2026-01-04
 ## Usage
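
Note on the changed metric: the `value` field uses a `mean +/- std` shape. The commit does not say how the number was computed, so the following is only a minimal sketch, assuming it is the mean and population standard deviation of per-episode returns from an evaluation run. The helper name `format_mean_reward` and the sample returns are hypothetical and do not come from the repository.

```python
from statistics import fmean, pstdev


def format_mean_reward(episode_returns):
    """Format per-episode returns as 'mean +/- std' with two decimals.

    Assumption: std is the population standard deviation (divide by N).
    """
    if not episode_returns:
        raise ValueError("need at least one episode return")
    mean = fmean(episode_returns)
    std = pstdev(episode_returns)
    return f"{mean:.2f} +/- {std:.2f}"


# Hypothetical returns: identical episodes give a spread of zero.
print(format_mean_reward([807.10, 807.10, 807.10]))  # 807.10 +/- 0.00
```

A spread of 0.00 is what identical returns across episodes would produce, for example a deterministic policy in a deterministic environment, or a single evaluation episode.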