JingzeShi committed on
Commit 1f9da74 · verified · 1 Parent(s): 089ab87

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -16,9 +16,9 @@ Doge uses `wsd_scheduler` as the training scheduler, which divides the learning
 
 Here are the initial learning rates required to continue training at each checkpoint:
 
-- **[Doge-20M](https://huggingface.co/JingzeShi/Doge-20M-checkpoint)**: 8e-3
-- **[Doge-60M](https://huggingface.co/JingzeShi/Doge-60M-checkpoint)**: 6e-3
-- **Doge-160M**: 4e-3
+- **[Doge-20M](https://huggingface.co/SmallDoge/Doge-20M-checkpoint)**: 8e-3
+- **[Doge-60M](https://huggingface.co/SmallDoge/Doge-60M-checkpoint)**: 6e-3
+- **[Doge-160M](https://huggingface.co/SmallDoge/Doge-160M-checkpoint)**: 4e-3
 - **Doge-320M**: 2e-3
 
 | Model | Learning Rate | Schedule | Warmup Steps | Stable Steps |