---
license: apache-2.0
tags:
- serl
- reinforcement-learning
- qwen2.5
- checkpoints
---

# SeRL Training Checkpoints

Compressed checkpoints from SeRL (Self-Evolving Reinforcement Learning) experiments.
Weight files are stored with **ZipNN lossless compression** (about 33% smaller) and load transparently once the `zipnn_hf()` hook is enabled.

## Quick Start

```bash
pip install zipnn huggingface_hub transformers
```

```python
# Enable ZipNN transparent loading (call this before from_pretrained)
from zipnn import zipnn_hf
zipnn_hf()

from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download only the Hugging Face-format checkpoints of one experiment
path = snapshot_download(
    "AshwinKM2005/serl-checkpoints",
    allow_patterns="serl_arce_qwen25_1_5b/huggingface/*",
)

# Load the model (.znn files are decompressed automatically)
model = AutoModelForCausalLM.from_pretrained(
    f"{path}/serl_arce_qwen25_1_5b/huggingface/global_step200_hf"
)
```

## Experiments

| Experiment | Base Model | Dataset |
|------------|-----------|---------|
| serl_arcc_qwen25_0_5b | Qwen2.5-0.5B | ARC-Challenge |
| serl_ARC-c_qwen25_1_5b | Qwen2.5-1.5B | ARC-Challenge |
| serl_arce_qwen25_0_5b | Qwen2.5-0.5B | ARC-Easy |
| serl_arce_qwen25_1_5b | Qwen2.5-1.5B | ARC-Easy |
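
To download several experiments at once, the table can be mirrored as a small lookup and turned into a list of `allow_patterns` for `snapshot_download`. This is only a sketch: `EXPERIMENTS` and `patterns_for` are hypothetical helper names, not part of this repository, and the `<experiment>/huggingface/*` pattern is assumed to apply to every experiment as it does in the Quick Start.

```python
# Hypothetical lookup mirroring the experiments table: name -> (base model, dataset).
EXPERIMENTS = {
    "serl_arcc_qwen25_0_5b": ("Qwen2.5-0.5B", "ARC-Challenge"),
    "serl_ARC-c_qwen25_1_5b": ("Qwen2.5-1.5B", "ARC-Challenge"),
    "serl_arce_qwen25_0_5b": ("Qwen2.5-0.5B", "ARC-Easy"),
    "serl_arce_qwen25_1_5b": ("Qwen2.5-1.5B", "ARC-Easy"),
}


def patterns_for(dataset=None, base_model=None):
    """Return allow_patterns entries for experiments matching the filters.

    A filter left as None matches every experiment.
    """
    return [
        f"{name}/huggingface/*"
        for name, (model, data) in EXPERIMENTS.items()
        if (dataset is None or data == dataset)
        and (base_model is None or model == base_model)
    ]
```

The resulting list can be passed as `allow_patterns=patterns_for(dataset="ARC-Easy")`, since `snapshot_download` accepts either a single pattern or a list of patterns.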

## Structure

```
serl-checkpoints/
├── serl_arcc_qwen25_0_5b/
│   ├── huggingface/global_step100_hf/
│   └── deepspeed/global_step100/
├── serl_ARC-c_qwen25_1_5b/
│   └── ...
└── ...
```
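
The naming scheme in the tree above can be captured in a small path builder. This is a minimal sketch: `checkpoint_subdir` is a hypothetical helper, and it assumes every experiment follows the `global_step<N>_hf` / `global_step<N>` naming shown here. Which step numbers exist for a given experiment is not listed in this README, so check the repository before relying on a particular step.

```python
from pathlib import PurePosixPath


def checkpoint_subdir(experiment: str, step: int, kind: str = "huggingface") -> str:
    """Build the repo-relative directory of one checkpoint.

    kind is "huggingface" (exported model) or "deepspeed" (training state).
    """
    if kind == "huggingface":
        leaf = f"global_step{step}_hf"
    elif kind == "deepspeed":
        leaf = f"global_step{step}"
    else:
        raise ValueError(f"unknown checkpoint kind: {kind!r}")
    # Repo paths always use forward slashes, regardless of the local OS.
    return str(PurePosixPath(experiment) / kind / leaf)
```

For example, `checkpoint_subdir("serl_arce_qwen25_1_5b", 200)` reproduces the directory used in the Quick Start, and appending `/*` to the result gives a pattern that restricts `snapshot_download` to a single step.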

## Compression

Files ending in `.safetensors.znn` are ZipNN-compressed.
Call `zipnn_hf()` before loading a model so that these files are decompressed transparently.
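
To check which weight files in a downloaded snapshot are actually compressed, the directory can be scanned by file suffix. The sketch below uses only the standard library and does not call ZipNN itself; `split_by_compression` is a hypothetical helper name.

```python
from pathlib import Path


def split_by_compression(root):
    """Return (compressed, plain) lists of weight files found under root.

    compressed: files ending in .safetensors.znn
    plain:      files ending in .safetensors
    """
    root = Path(root)
    compressed = sorted(root.rglob("*.safetensors.znn"))
    plain = sorted(root.rglob("*.safetensors"))
    return compressed, plain
```

Passing the `path` returned by `snapshot_download` gives a quick view of which files need the `zipnn_hf()` hook to load.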