GX-XinGao committed · Commit 2db8cd7 · verified · 1 parent: 1480bd9

Update README.md

Files changed (1):
1. README.md (+1, -20)
README.md CHANGED

@@ -19,7 +19,7 @@ metrics:
 # Qwen2.5-7B-ODA-Mixture-100k
 <img src="performance.png" alt="Leaderboard Performance" width="1200" />
 
-Qwen2.5-7B-ODA-Mixture-100k is a supervised fine-tuned (SFT) model built on top of **Qwen2.5-7B-Base**, trained with **[ODA-Mixture-100k](https://huggingface.co/datasets/OpenDataArena/ODA-Mixture-100k)**. This training set is curated by mixing top-performing open corpora selected via the *[OpenDataArena](https://github.com/OpenDataArena/OpenDataArena-Tool)* leaderboard, and refined through deduplication and benchmark decontamination, aiming to improve the model’s general capabilities across **General**, **Math**, **Code**, and **Reasoning** domains under a compact ~100K data budget.
+Qwen2.5-7B-ODA-Mixture-100k is a supervised fine-tuned (SFT) model built on top of **Qwen2.5-7B-Base**, trained with **[ODA-Mixture-100k](https://huggingface.co/datasets/OpenDataArena/ODA-Mixture-100k)**. This training set is curated by mixing top-performing open corpora selected via the *[OpenDataArena](https://opendataarena.github.io)* leaderboard, and refined through deduplication and benchmark decontamination, aiming to improve the model’s general capabilities across **General**, **Math**, **Code**, and **Reasoning** domains under a compact ~100K data budget.
 
 ---
 
@@ -221,25 +221,6 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 ---
 
-## 🏋️ Training Hyperparameters
-
-The following hyperparameters were used during training:
-
-- **learning_rate**: 5e-05
-- **train_batch_size**: 1
-- **eval_batch_size**: 8
-- **seed**: 42
-- **distributed_type**: multi-GPU
-- **num_devices**: 32
-- **total_train_batch_size**: 32
-- **total_eval_batch_size**: 256
-- **optimizer**: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
-- **lr_scheduler_type**: cosine
-- **lr_scheduler_warmup_ratio**: 0.1
-- **num_epochs**: 3.0
-
----
-
 ## 📚 Citation
 
 If you use this model or its training data (ODA-Mixture-100k), please cite:
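Both versions of the changed paragraph link the same dataset card, so the ~100K mixture is easy to inspect directly. A minimal sketch, assuming the `datasets` library and a standard `train` split (the split name is not stated in the diff):

```python
# Hedged sketch: loading ODA-Mixture-100k for inspection.
# The dataset id comes from the README link; the "train" split is an assumption.
from datasets import load_dataset

ds = load_dataset("OpenDataArena/ODA-Mixture-100k", split="train")
print(len(ds))  # should be on the order of ~100K examples per the README
print(ds[0])    # inspect one record's schema
```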
 
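The hyperparameters removed by this commit map naturally onto Hugging Face `TrainingArguments`. The sketch below only restates the deleted values; the framework choice, `output_dir`, and the gradient-accumulation setting are assumptions, not the authors' actual training script.

```python
# Hedged sketch: the removed hyperparameter list expressed as TrainingArguments.
# Values mirror the deleted diff lines; everything else is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen2.5-7b-oda-mixture-100k-sft",  # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=1,  # train_batch_size: 1
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=42,
    optim="adamw_torch",            # optimizer: adamw_torch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,               # lr_scheduler_warmup_ratio: 0.1
    num_train_epochs=3.0,
)
```

With 32 devices (distributed_type: multi-GPU) and no gradient accumulation, the per-device sizes reproduce the removed totals: 1 × 32 = 32 for training and 8 × 32 = 256 for evaluation.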
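The second hunk header quotes the tail of the README's usage snippet, `print(tokenizer.decode(outputs[0], skip_special_tokens=True))`. A minimal inference sketch ending in that line, assuming the hypothetical repo id `OpenDataArena/Qwen2.5-7B-ODA-Mixture-100k` and a Qwen2.5-style chat template:

```python
# Hedged sketch: one generation call ending in the decode line quoted above.
# The repo id is an assumption; substitute the actual model id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenDataArena/Qwen2.5-7B-ODA-Mixture-100k"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Solve: 12 * (3 + 4) - 5"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```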