Commit History

fix OOM: chunked KL with checkpointing + PYTORCH_CUDA_ALLOC_CONF expandable_segments; add kl_chunk_size config key
eb5278f
verified

Delta-Vector committed on

add grow_layers, sweep configs (replicate_zero4, grow40_winning, grow40_simple), sweep runner
3f04365
verified

Delta-Vector committed on

initial scaffold: distill.py + base/zero_14_17 configs + accelerate yaml
f6e42f8
verified

Delta-Vector committed on