---
license: apache-2.0
tags:
- serl
- reinforcement-learning
- qwen2.5
- checkpoints
---

# SeRL Training Checkpoints

Compressed checkpoints from SeRL (Self-Evolving Reinforcement Learning) experiments. Files use **ZipNN lossless compression** (~33% smaller, transparent loading).

## Quick Start

```bash
pip install zipnn huggingface_hub transformers
```

```python
# Enable ZipNN transparent loading
from zipnn import zipnn_hf
zipnn_hf()

from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download specific checkpoint
path = snapshot_download(
    "AshwinKM2005/serl-checkpoints",
    allow_patterns="serl_arce_qwen25_1_5b/huggingface/*"
)

# Load model (auto-decompresses .znn files)
model = AutoModelForCausalLM.from_pretrained(
    f"{path}/serl_arce_qwen25_1_5b/huggingface/global_step200_hf"
)
```

## Experiments

| Experiment | Base Model | Dataset |
|------------|-----------|---------|
| serl_arcc_qwen25_0_5b | Qwen2.5-0.5B | ARC-Challenge |
| serl_ARC-c_qwen25_1_5b | Qwen2.5-1.5B | ARC-Challenge |
| serl_arce_qwen25_0_5b | Qwen2.5-0.5B | ARC-Easy |
| serl_arce_qwen25_1_5b | Qwen2.5-1.5B | ARC-Easy |

## Structure

```
serl-checkpoints/
├── serl_arcc_qwen25_0_5b/
│   ├── huggingface/global_step100_hf/
│   └── deepspeed/global_step100/
├── serl_ARC-c_qwen25_1_5b/
│   └── ...
└── ...
```

## Compression

Files ending in `.safetensors.znn` are ZipNN compressed. The `zipnn_hf()` hook enables transparent loading.
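
## Locating Checkpoint Files

The directory layout under Structure can be turned into paths programmatically, which helps when downloading a single step rather than a whole experiment. The sketch below uses only the standard library. The helper names `checkpoint_subdir` and `find_znn_files` are hypothetical (they are not part of ZipNN or `huggingface_hub`), and the code assumes every experiment follows the `global_step<N>_hf` / `global_step<N>` naming shown above. Which steps actually exist for a given experiment is not listed here, so check the repository before relying on a particular step number.

```python
from pathlib import Path


def checkpoint_subdir(experiment: str, step: int, kind: str = "huggingface") -> str:
    """Return the repo-relative directory for one checkpoint.

    Hypothetical helper. Assumes the layout shown under Structure:
      <experiment>/huggingface/global_step<N>_hf
      <experiment>/deepspeed/global_step<N>
    """
    if kind == "huggingface":
        return f"{experiment}/huggingface/global_step{step}_hf"
    if kind == "deepspeed":
        return f"{experiment}/deepspeed/global_step{step}"
    raise ValueError(f"unknown checkpoint kind: {kind!r}")


def find_znn_files(root) -> list:
    """List ZipNN-compressed safetensors files under a local directory.

    Hypothetical helper. Paths are returned relative to `root`,
    with forward slashes, in sorted order.
    """
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.safetensors.znn")
    )
```

For example, `checkpoint_subdir("serl_arce_qwen25_1_5b", 200) + "/*"` can be passed as `allow_patterns` to `snapshot_download` to fetch only that step, and `find_znn_files(path)` on the returned snapshot path shows which downloaded files still need the `zipnn_hf()` hook to load.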