jaxgmg_ckpt_df / README.md
### OBSOLETE
A series of models trained while varying `duplication_factor`, with the cheese always in the corner, so the environment had only `120` possible states. These checkpoints are no longer relevant: we now train with `alpha`, drawing environments from the distribution $\Lambda_{\alpha} = \alpha \Lambda_{1} + (1 - \alpha) \Lambda_{0}$, where
* $\Lambda_1$: uniform distribution over all cheese/mouse positions
* $\Lambda_0$: cheese always in corner, mouse uniform over all positions
`duplication_factor` has since been removed, because the environment now has roughly 14k possible states instead of 120.
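
The mixture $\Lambda_{\alpha}$ described above can be sampled by first flipping an $\alpha$-weighted coin and then drawing from the chosen component. The following is only a minimal sketch of that idea in plain Python, not the training code: `N_CELLS`, `CORNER`, the flat cell indexing, and all function names are hypothetical, and the sketch assumes the mouse and cheese positions are drawn independently.

```python
import random

# Hypothetical layout: free cells are indexed 0..N_CELLS-1 and cell 0 is
# "the corner". Neither value is taken from the actual environment code.
N_CELLS = 120
CORNER = 0


def sample_lambda_1(rng):
    """Lambda_1: cheese and mouse each uniform over all cells."""
    cheese = rng.randrange(N_CELLS)
    mouse = rng.randrange(N_CELLS)
    return cheese, mouse


def sample_lambda_0(rng):
    """Lambda_0: cheese fixed in the corner, mouse uniform over all cells."""
    return CORNER, rng.randrange(N_CELLS)


def sample_lambda_alpha(alpha, rng):
    """Draw (cheese, mouse) from alpha * Lambda_1 + (1 - alpha) * Lambda_0."""
    # rng.random() lies in [0, 1), so alpha = 0 always picks Lambda_0
    # and alpha = 1 always picks Lambda_1.
    if rng.random() < alpha:
        return sample_lambda_1(rng)
    return sample_lambda_0(rng)
```

For example, `sample_lambda_alpha(0.1, random.Random(0))` returns a `(cheese, mouse)` pair in which the cheese sits in the corner for roughly 90% of draws, plus the small share of $\Lambda_1$ draws that happen to land there.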