- 1e-6_1.0std
- 1e-6_kl
- 1e-6_kl_halfstd
- 1e-6_kl_sampling
- 1e-6kl_0.5std
- baseline
- consistency_baseline
- cosine
- dt0_1
- global_over_four_channel_mean
- global_over_two_channel_mean
- global_std_channel_mean
- learned_cfg
- logit
- naive_baseline
- naive_normalize
- no_dut
- normal_mean_scale_std
- rmsnorm
- rotary
- rotary2
- swiglu
- tminus_fixed
- whiten
- zero_end_targets