First Push (Colab small run)
- README.md +3 -3
- replay.mp4 +0 -0
- results.json +3 -3
README.md
CHANGED
@@ -17,7 +17,7 @@ model-index:
 type: CartPole-v1
 metrics:
 - type: mean_reward
-value:
+value: 83.60 +/- 50.09
 name: mean_reward
 verified: false
 ---
@@ -27,5 +27,5 @@ model-index:
 Trained with a minimal CleanRL-style PPO implementation in Google Colab.
 
 ## Results
-- Mean reward: **
-- Std reward: **
+- Mean reward: **83.60**
+- Std reward: **50.09**
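The README values in this commit look like the two numbers from results.json rounded to two decimals and joined with `+/-`. The sketch below illustrates that relationship under that assumption. The helper name `format_metric` is hypothetical and is not taken from the training notebook.

```python
import json


def format_metric(mean_reward: float, std_reward: float) -> str:
    """Render a mean/std pair as 'mean +/- std' with two decimals (assumed format)."""
    return f"{mean_reward:.2f} +/- {std_reward:.2f}"


# Values copied from the results.json shown in this commit.
results = json.loads('{"mean_reward": 83.6, "std_reward": 50.08632547911655}')

metric_value = format_metric(results["mean_reward"], results["std_reward"])
print(metric_value)  # 83.60 +/- 50.09
```

Deriving the model card string from results.json in one place would keep the README and the JSON file from drifting apart between pushes.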
replay.mp4
CHANGED
Binary files a/replay.mp4 and b/replay.mp4 differ
results.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "env_id": "CartPole-v1",
-"mean_reward":
-"std_reward":
+"mean_reward": 83.6,
+"std_reward": 50.08632547911655,
 "n_evaluation_episodes": 10,
-"eval_datetime": "2026-01-07T04:
+"eval_datetime": "2026-01-07T04:36:19.605208"
 }
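A file with these fields could be produced from a list of per-episode returns roughly as follows. This is a minimal sketch, not the notebook's actual code: the function name `summarize_episodes` and the example returns are hypothetical, and the use of the population standard deviation (dividing by N) is an assumption, since the commit does not say which estimator was used.

```python
import datetime
import json
import statistics


def summarize_episodes(env_id: str, episode_returns: list[float]) -> dict:
    """Build a results.json-style summary from per-episode returns."""
    return {
        "env_id": env_id,
        "mean_reward": statistics.fmean(episode_returns),
        # Assumption: population standard deviation (divides by N).
        "std_reward": statistics.pstdev(episode_returns),
        "n_evaluation_episodes": len(episode_returns),
        # ISO 8601 timestamp, the same shape as the eval_datetime field above.
        "eval_datetime": datetime.datetime.now().isoformat(),
    }


# Hypothetical returns for illustration only, NOT the episodes behind this commit.
example_returns = [20.0, 40.0, 60.0, 80.0, 100.0, 120.0, 140.0, 160.0, 180.0, 200.0]
summary = summarize_episodes("CartPole-v1", example_returns)
print(json.dumps(summary, indent=2))
```

With only 10 evaluation episodes, the standard deviation is a noisy estimate, which is worth keeping in mind when reading a spread as large as the one reported here.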