Delta-Vector/distill-m-6a3lnzvb-code
distill-m-6a3lnzvb-code/scripts @ e9ce4f0 (3.83 kB)
1 contributor
History: 3 commits
Latest commit: eb5278f (verified), Delta-Vector, about 2 months ago
    fix OOM: chunked KL with checkpointing + PYTORCH_CUDA_ALLOC_CONF expandable_segments; add kl_chunk_size config key
backup_to_hf.py, 1.64 kB, about 2 months ago
    fix OOM: chunked KL with checkpointing + PYTORCH_CUDA_ALLOC_CONF expandable_segments; add kl_chunk_size config key
run_sweep.sh, 1.14 kB, about 2 months ago
    add grow_layers, sweep configs (replicate_zero4, grow40_winning, grow40_simple), sweep runner
run_sweep_rerun.sh, 1.05 kB, about 2 months ago
    fix OOM: chunked KL with checkpointing + PYTORCH_CUDA_ALLOC_CONF expandable_segments; add kl_chunk_size config key