Commit e855284 by AshwinKM2005 (verified) · 1 parent: 5d9ea33

Upload SeRL checkpoints with ZipNN compression

Files changed (1)
  1. README.md +60 -0
README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ license: apache-2.0
+ tags:
+ - serl
+ - reinforcement-learning
+ - qwen2.5
+ - checkpoints
+ ---
+
+ # SeRL Training Checkpoints
+
+ Compressed checkpoints from SeRL (Self-Evolving Reinforcement Learning) experiments.
+ Files use **ZipNN lossless compression** (~33% smaller, transparent loading).
+
+ ## Quick Start
+
+ ```bash
+ pip install zipnn huggingface_hub transformers
+ ```
+
+ ```python
+ # Enable ZipNN transparent loading
+ from zipnn import zipnn_hf
+ zipnn_hf()
+
+ from huggingface_hub import snapshot_download
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Download specific checkpoint
+ path = snapshot_download(
+     "AshwinKM2005/serl-checkpoints",
+     allow_patterns="serl_arce_qwen25_1_5b/huggingface/global_step200_hf/*"
+ )
+
+ # Load model (auto-decompresses .znn files)
+ model = AutoModelForCausalLM.from_pretrained(
+     f"{path}/serl_arce_qwen25_1_5b/huggingface/global_step200_hf"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")
+ ```
+
+ ## Experiments
+
+ | Experiment | Base Model | Dataset |
+ |------------|-----------|---------|
+ | serl_arcc_qwen25_0_5b | Qwen2.5-0.5B | ARC-Challenge |
+ | serl_ARC-c_qwen25_1_5b | Qwen2.5-1.5B | ARC-Challenge |
+ | serl_arce_qwen25_0_5b | Qwen2.5-0.5B | ARC-Easy |
+ | serl_arce_qwen25_1_5b | Qwen2.5-1.5B | ARC-Easy |
+
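The Quick Start path appears to follow the pattern `<experiment>/huggingface/global_step<N>_hf`. The sketch below builds that path for any experiment in the table. It assumes the other experiments and steps use the same layout, which the README confirms only for `serl_arce_qwen25_1_5b` at step 200. `checkpoint_subdir` and `checkpoint_pattern` are hypothetical helper names, not part of any library.

```python
def checkpoint_subdir(experiment: str, step: int) -> str:
    """Relative directory of one HuggingFace-format checkpoint (assumed layout)."""
    return f"{experiment}/huggingface/global_step{step}_hf"


def checkpoint_pattern(experiment: str, step: int) -> str:
    """Glob string that could be passed as snapshot_download(allow_patterns=...)."""
    return checkpoint_subdir(experiment, step) + "/*"


# Reproduces the pattern used in the Quick Start example.
print(checkpoint_pattern("serl_arce_qwen25_1_5b", 200))
```

Building the directory and the glob from one function keeps the `allow_patterns` value and the `from_pretrained` path in sync.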
+ ## Structure
+
+ Each experiment contains:
+ - `huggingface/` - HuggingFace-compatible checkpoints
+ - `deepspeed/` - DeepSpeed training checkpoints
+
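To see what a repository with this two-directory layout contains, its file list can be grouped by experiment and checkpoint kind. This is a minimal sketch: `group_by_experiment` is a hypothetical helper, and the sample paths are invented for illustration rather than taken from the repository.

```python
from collections import defaultdict


def group_by_experiment(paths):
    """Map experiment name -> {"huggingface": [...], "deepspeed": [...]}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for p in paths:
        parts = p.split("/")
        # Expect "<experiment>/<huggingface|deepspeed>/..."; skip anything else.
        if len(parts) >= 3 and parts[1] in ("huggingface", "deepspeed"):
            grouped[parts[0]][parts[1]].append("/".join(parts[2:]))
    return {exp: dict(kinds) for exp, kinds in grouped.items()}


# Hypothetical file names, for illustration only.
sample = [
    "README.md",
    "serl_arce_qwen25_1_5b/huggingface/global_step200_hf/config.json",
    "serl_arce_qwen25_1_5b/deepspeed/example_file",
]
print(group_by_experiment(sample))
```

In practice the path list could come from `huggingface_hub.list_repo_files("AshwinKM2005/serl-checkpoints")`, which returns the repository's file paths as strings.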
+ ## Compression
+
+ Files ending in `.safetensors.znn` are ZipNN compressed.
+ The `zipnn_hf()` hook enables transparent loading without manual decompression.
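To check which files in a downloaded snapshot are ZipNN-compressed, the directory can be scanned for the `.safetensors.znn` suffix. This sketch uses only the standard library and does not decompress anything. `find_znn_files` is a hypothetical helper, and it assumes that dropping the trailing `.znn` gives the original file name.

```python
from pathlib import Path

ZNN_SUFFIX = ".znn"


def find_znn_files(root):
    """Return sorted (compressed_path, original_name) pairs found under root."""
    pairs = []
    for p in Path(root).rglob("*.safetensors" + ZNN_SUFFIX):
        # Assumption: removing ".znn" recovers the uncompressed file name.
        pairs.append((p, p.name[: -len(ZNN_SUFFIX)]))
    return sorted(pairs)
```

Calling `find_znn_files(path)` on the directory returned by `snapshot_download` lists the compressed weight files; an empty list suggests the snapshot holds plain `.safetensors` files or none at all.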