Update model card
README.md
CHANGED
@@ -12,7 +12,7 @@ model-index:
 type: reinforcement-learning
 metrics:
 - type: mean_reward
-value:
+value: 807.10 +/- 0.00
 name: Mean Reward
 ---
 
@@ -24,8 +24,8 @@ This is a trained PPO agent for the CarV1 line-following environment.
 
 - **Algorithm**: PPO (Proximal Policy Optimization)
 - **Framework**: Stable-Baselines3
-- **Training Timesteps**:
-- **Mean Reward**:
+- **Training Timesteps**: 99,840
+- **Mean Reward**: 807.10 ± 0.00
 - **Training Date**: 2026-01-04
 
 ## Usage
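
Note on the new metric values: the commit does not show how `807.10 ± 0.00` was computed. The sketch below is one plausible way to produce such a "mean ± std" string from per-episode returns, using only the Python standard library. The helper name `format_mean_reward` and the sample returns are hypothetical and do not come from the repository. It also illustrates that a spread of exactly 0.00 arises whenever every evaluation episode yields the same return, for example with a deterministic policy in a deterministic environment, or with a single evaluation episode.

```python
import statistics


def format_mean_reward(episode_rewards, sep="±"):
    """Format per-episode returns as '<mean> <sep> <std>' with two decimals.

    Hypothetical helper for illustration; not taken from the model repository.
    """
    mean = statistics.fmean(episode_rewards)
    # Population standard deviation (divides by N, not N - 1).
    std = statistics.pstdev(episode_rewards)
    return f"{mean:.2f} {sep} {std:.2f}"


# Assumed sample data: ten identical episode returns, so the spread is zero.
rewards = [807.10] * 10
print(format_mean_reward(rewards))         # markdown-style: 807.10 ± 0.00
print(format_mean_reward(rewards, "+/-"))  # YAML-style: 807.10 +/- 0.00
```

If the returns were produced by several stochastic episodes, a nonzero spread would be expected, so a reported `0.00` is worth double-checking against the evaluation setup.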