Commit · 485fe66
1 Parent(s): 24aea20
Update README.md
README.md CHANGED
@@ -60,4 +60,4 @@ This model is nearing SOTA performance for the Freeway environment: https://www.
 
 The composite score at 10 million timesteps is ~32 which is only two points off SOTA of 34. It appears that with PPO even after 2BN timesteps performance can only reach 33.6 - https://huggingface.co/edbeeching/atari_2B_atari_freeway_3333
 
-I suspect that as with QR-DQN the SAC and TQC models can reach 34 - they just need more training to do so.
+I suspect that as with QR-DQN the SAC and TQC models can reach 34 - they just need more training to do so.