| ### OBSOLETE |
|
|
| A series of models trained with varying `duplication_factor` while the cheese was always in the corner, so the environment had only `120` possible states. No longer relevant, as we now train with `alpha`, using the environmental distribution $\Lambda_{\alpha} = \alpha \Lambda_{1} + (1 - \alpha) \Lambda_{0}$, where |
| * $\Lambda_1$: uniform distribution over all cheese/mouse positions |
| * $\Lambda_0$: cheese always in the corner, mouse uniform over all positions |
| |
| `duplication_factor` has since been removed, since there are now ~14k states instead of 120. |
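
As a rough illustration of the mixture $\Lambda_{\alpha}$, the sketch below samples a (cheese, mouse) pair by drawing from $\Lambda_1$ with probability `alpha` and from $\Lambda_0$ otherwise. It is not the project's training code. The names `GRID_SIZE`, `CORNER`, and the `sample_*` functions are hypothetical, and the 11×11 grid with cheese and mouse on distinct cells is an assumption, chosen only because it reproduces the state counts quoted above.

```python
import random

GRID_SIZE = 11  # assumption: the grid size is not stated in the text
CORNER = (GRID_SIZE - 1, GRID_SIZE - 1)  # assumption: which corner is arbitrary


def all_cells(n=GRID_SIZE):
    """Every (row, col) cell of an n x n grid."""
    return [(r, c) for r in range(n) for c in range(n)]


def sample_lambda_0(rng, n=GRID_SIZE, corner=CORNER):
    """Lambda_0: cheese fixed in the corner, mouse uniform over the other cells."""
    mouse = rng.choice([cell for cell in all_cells(n) if cell != corner])
    return corner, mouse


def sample_lambda_1(rng, n=GRID_SIZE):
    """Lambda_1: cheese and mouse uniform over all pairs of distinct cells."""
    cheese, mouse = rng.sample(all_cells(n), 2)
    return cheese, mouse


def sample_lambda_alpha(alpha, rng, n=GRID_SIZE, corner=CORNER):
    """Mixture: draw from Lambda_1 with probability alpha, else from Lambda_0."""
    if rng.random() < alpha:
        return sample_lambda_1(rng, n)
    return sample_lambda_0(rng, n, corner)


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, [sample_lambda_alpha(alpha, rng) for _ in range(3)])
```

Under these assumptions, $\Lambda_0$ has 121 - 1 = 120 states and $\Lambda_1$ has 121 × 120 = 14,520, which is in line with the "120" and "~14k" figures above. Setting `alpha = 0` recovers the old corner-only setup, and `alpha = 1` gives the fully uniform one.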