sswoo123 committed on
Commit f60067c · verified · 1 Parent(s): ecfa7bf

Update README.md

Files changed (1)
  1. README.md +47 -0
README.md CHANGED
@@ -160,6 +160,53 @@ chat_prompt = tokenizer.apply_chat_template(
  )
  ```
  ---
+
+ ## 🪄 Using Specific Revisions (Training Checkpoints)
+
+ KORMo provides multiple model revisions corresponding to different training stages and checkpoints.
+ You can load a specific revision with the `revision` parameter in `from_pretrained`.
+
+ ### 📍 Stage 1 Model (sft-stage1)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_name = "KORMo-Team/KORMo-10B-sft"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     revision="sft-stage1",  # Load Stage 1 checkpoint
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True
+ )
+ ```
+
+ ### 🚀 Main Model (Final Checkpoint: sft-stage2-ckpt2)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_name = "KORMo-Team/KORMo-10B-sft"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     revision="sft-stage2-ckpt2",  # Load Final Main Checkpoint
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True
+ )
+ ```
+
+ > 💡 **Tip**:
+ > - Use `sft-stage1` for ablation studies or comparison experiments.
+ > - Use `sft-stage2-ckpt2` as the **main production model**.
+
+ ---
+
+
  ## Contact
  - KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr`
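
The two snippets added in this change differ only in the `revision` argument, so checkpoint selection can be factored into a small helper when comparing stages. The following is a minimal sketch and not part of the commit: `CHECKPOINTS`, `resolve_revision`, and `load_checkpoint` are hypothetical names, the short keys `"stage1"` and `"main"` are invented for illustration, and only the two revision strings come from the README. Passing `revision` to the tokenizer as well is an assumption that the tokenizer files live on the same branch.

```python
from typing import Dict

MODEL_NAME = "KORMo-Team/KORMo-10B-sft"

# Revision strings are taken from the README change; the short keys are hypothetical.
CHECKPOINTS: Dict[str, str] = {
    "stage1": "sft-stage1",        # intermediate checkpoint, for ablations and comparisons
    "main": "sft-stage2-ckpt2",    # final checkpoint, described as the main model
}


def resolve_revision(stage: str) -> str:
    """Map a short stage key to the revision name used on the Hub."""
    try:
        return CHECKPOINTS[stage]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown stage {stage!r}; expected one of: {known}") from None


def load_checkpoint(stage: str = "main"):
    """Load tokenizer and model for the given stage (downloads weights on first use)."""
    # Heavy imports are deferred so resolve_revision can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    revision = resolve_revision(stage)
    # Assumption: the tokenizer is pinned to the same revision as the model.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        revision=revision,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,
    )
    return tokenizer, model
```

Under these assumptions, `load_checkpoint("stage1")` and `load_checkpoint("main")` would return the two models side by side for a comparison run, while an unrecognized key fails early with a `ValueError` before any download starts.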