| ### UNUSED, BUT NOT USEFUL |
|
|
| A rerun of the models |
| ``` |
| al_0.75_g_0.97_seed_122_pa_1 |
| al_0.75_g_0.97_seed_131_pa_1 |
| al_0.75_g_0.97_seed_200_pa_1 |
| ``` |
| because earlier runs with the same seeds had chaotic loss/regret curves. The chaotic behavior did not replicate when rerunning with the same seeds, so I attribute the bad runs to faulty hardware or something else I can't control. |
|
|
| Wandb: https://wandb.ai/devinterp/jaxgmg2_cursed |
| |
| Hyperparams: |
| ``` |
| rl_action=train |
| model_type=impala |
| lr=5e-05 |
| discount_rate=0.97 |
| num_rollout_steps=64 |
| grad_acc_per_chunk=4 |
| num_rollout_chunks=1 |
| cheese_loc=any |
| env_layout=open |
| alpha=0.75 |
| env_size=13 |
| num_levels=9600 |
| compile=True |
| use_prev_action=False |
| weight_restrictions=None |
| weight_restrictions_invert=False |
| use_bf16=False |
| use_wandb=True |
| seed=122 |
| mask_type=first_episode |
| ckpt_dir=jaxgmg2_cursed |
| vis_average_state=False |
| trim_episodes=False |
| num_total_env_steps=9999974400 |
| eval_every=1 |
| eff_horizon=None |
| optim=adam |
| env_rule=None |
| env_rule_mixture=None |
| hf_user=davidquarel |
| hf_collection=davidquarel/jaxgmg |
| use_hf=True |
| num_hf_uploads=1 |
| use_log=True |
| log_optimizer_state=False |
| resume=None |
| resume_id=None |
| resume_optim=False |
| checkpoint=al_0.75_g_0.97_seed_122_pa_1 |
| wandb_project=jaxgmg2_cursed |
| eval_schedule=0:1,250:2,500:5,2000:10 |
| render_sixel=False |
| sixel_idx=60 |
| live_monitor=False |
| run_id=0 |
| seed_formula=None |
| deterministic=True |
| penalize_time=False |
| f_str_ckpt=al_0.75_g_0.97_seed_122_pa_1 |
| duplication_factor=-1 |
| smoke=False |
| ntfy=david_jaxgmg |
| num_chains=6 |
| num_draws=3000 |
| num_steps_bw_draws=1 |
| on_policy=True |
| llc_nbeta=3000 |
| localization=10 |
| exact_solver_each_draw=False |
| llc_optimizer=sgld |
| iw_clip_eps=None |
| rmsprop_burnin_steps=20 |
| llc_data_file=llc_scan_open_reinforce.pkl |
| llc_checkpoint_index=None |
| llc_checkpoint_number=None |
| sink=None |
| repo_id=davidquarel/jaxgmg_ckpt_zip |
| use_shuffled_checkpoints=False |
| force_re_download=False |
| off_distribution_data=False |
| evaluate_every_position=False |
| num_prev_actions=1 |
| eff_acc_steps=4 |
| chunk_size=9600 |
| env_steps_per_microbatch=153600 |
| ckpt_path=jaxgmg2_cursed/al_0.75_g_0.97_seed_122_pa_1 |
| env_steps_per_loop=614400 |
| total_loops=16276 |
| ``` |
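
The derived values at the end of the dump (`eff_acc_steps`, `env_steps_per_microbatch`, `env_steps_per_loop`, `total_loops`) appear to follow from the earlier settings. The sketch below is a sanity check of that arithmetic. The formulas are assumptions inferred from the logged numbers, not taken from the training code, and `env_steps_per_chunk` is a hypothetical helper name.

```python
# Values copied from the hyperparameter dump.
chunk_size = 9600                 # same value as num_levels in the dump
num_rollout_steps = 64
num_rollout_chunks = 1
grad_acc_per_chunk = 4
num_total_env_steps = 9_999_974_400

# Assumed relations: inferred from the logged numbers, not from the training code.
env_steps_per_chunk = chunk_size * num_rollout_steps            # hypothetical helper
eff_acc_steps = grad_acc_per_chunk * num_rollout_chunks
env_steps_per_microbatch = env_steps_per_chunk // grad_acc_per_chunk
env_steps_per_loop = env_steps_per_chunk * num_rollout_chunks
total_loops = num_total_env_steps // env_steps_per_loop

print(f"eff_acc_steps={eff_acc_steps}")
print(f"env_steps_per_microbatch={env_steps_per_microbatch}")
print(f"env_steps_per_loop={env_steps_per_loop}")
print(f"total_loops={total_loops}")
```

Under these assumptions the computed values match the logged ones (4, 153600, 614400, 16276), and `num_total_env_steps` is an exact multiple of `env_steps_per_loop`, which would explain the otherwise odd-looking total of 9999974400.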