Commit 9017290 by HectorHe · verified · 1 Parent(s): b930d5d

End of training

Files changed (2)
  1. README.md +3 -1
  2. training.log +2 -0
README.md CHANGED
@@ -1,8 +1,10 @@
 ---
+datasets: HectorHe/math7k
 library_name: transformers
 model_name: Deepseek-V2-13B-Math7K-Expert-Enhance-Subset-Expert-MoE-32-experts
 tags:
 - generated_from_trainer
+- open-r1
 - trl
 - expert_enhance_distillation
 licence: license
@@ -10,7 +12,7 @@ licence: license
 
 # Model Card for Deepseek-V2-13B-Math7K-Expert-Enhance-Subset-Expert-MoE-32-experts
 
-This model is a fine-tuned version of [None](https://huggingface.co/None).
+This model is a fine-tuned version of [None](https://huggingface.co/None) on the [HectorHe/math7k](https://huggingface.co/datasets/HectorHe/math7k) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
 ## Quick start
training.log CHANGED
@@ -1009,3 +1009,5 @@ Memory reserved: 13710.0
 (lm_head): Linear(in_features=2048, out_features=102400, bias=False)
 )
 2025-08-18 19:00:43 - INFO - __main__ - *** Saving model ***
+2025-08-18 19:09:42 - INFO - __main__ - Model saved to data/DeepSeek-Coder-V2-Lite-Instruct/expert_enhance/subset_expert_moe/math7K/32_experts
+2025-08-18 19:09:42 - INFO - __main__ - Pushing to hub...
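The "Saving model" and "Model saved" entries in the training.log hunk bracket the checkpoint write, so their timestamps give a rough save duration. The following is a minimal sketch, assuming the log keeps the `YYYY-MM-DD HH:MM:SS - LEVEL - logger - message` layout seen in the diff; the helper names `parse_timestamp` and `elapsed_seconds` are hypothetical and not part of the training script.

```python
from datetime import datetime

# Two entries copied from the training.log hunk in this commit.
LOG_LINES = [
    "2025-08-18 19:00:43 - INFO - __main__ - *** Saving model ***",
    "2025-08-18 19:09:42 - INFO - __main__ - Model saved to "
    "data/DeepSeek-Coder-V2-Lite-Instruct/expert_enhance/subset_expert_moe/math7K/32_experts",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' prefix of a log line (assumed layout)."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def elapsed_seconds(start_line: str, end_line: str) -> float:
    """Return the wall-clock seconds between two log entries."""
    return (parse_timestamp(end_line) - parse_timestamp(start_line)).total_seconds()


save_seconds = elapsed_seconds(LOG_LINES[0], LOG_LINES[1])
print(f"Saving took about {save_seconds:.0f} s")
```

This only measures the gap between the two log lines; it says nothing about where the time went (serialization, disk I/O, or anything else the script does in between).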