The RL Zoo is a training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
## Training details
This might seem like cheating, but it doesn't feel right to burn 90 minutes of GPU time (and tie up a Colab machine) to reproduce a model that is already available on the Hub, so I've borrowed a trained model and continued training for just a few steps.
I'm wondering if DQN training could be made faster/more efficient if we train the CNN in a semi-supervised style rather than starting with random weights: given the previous 4 frames, can the model predict the next frame?
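A rough sketch of that next-frame idea in PyTorch (hypothetical, not part of this model's training): the encoder mirrors the Nature DQN conv trunk, so its pretrained weights could later seed a DQN feature extractor, while a small deconvolutional decoder reconstructs the next frame from a stack of 4 grayscale 84x84 frames.

```python
import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        # Same conv shapes as the Nature DQN CNN (input: 4 stacked 84x84 frames).
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # Mirror-image decoder upsampling back to a single 84x84 frame.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=8, stride=4),
        )

    def forward(self, frames):  # frames: (B, 4, 84, 84)
        return self.decoder(self.encoder(frames))  # -> (B, 1, 84, 84)

# One semi-supervised step on dummy data: predict frame t+1 from frames t-3..t.
model = NextFramePredictor()
frames = torch.rand(2, 4, 84, 84)
next_frame = torch.rand(2, 1, 84, 84)
loss = nn.functional.mse_loss(model(frames), next_frame)
loss.backward()
```

The labels here are free (the next observed frame), so this pretraining needs no reward signal at all.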
## Usage (with SB3 RL Zoo)

RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo<br/>