replaced hyperparameter table with a premade png
README.md
CHANGED
@@ -107,13 +107,7 @@ Our training has 3 stages:
 For details of the training dataset for each stage, please refer to the Dataset section and our CrystalCoder Data Card.
 
 For hyperparameters used in each stage, please refer to the following table:
-
-| **Hyperparameter** | **Phase 1** | **Phase 2** | **Phase 3** |
-| --- | --- | --- | --- |
-| LR Warmup Steps| 86 | 86 | 276 |
-| LR Start Value | 0.012 | 0.0087825 | 0.002 |
-| LR Final Value | 0.00012408 | 0.00013679 | 0.0002 |
-| LR Decay | Linear | Linear | Linear |
+<center><img src="hyperparameters.png" alt="hyperparameter table" /></center>
 
 For more details of training, please refer to [our paper](https://arxiv.org/pdf/2312.06550.pdf).
 
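
The rows of the removed table (warmup steps, start value, final value, linear decay) describe a warmup-then-linear-decay learning-rate schedule. A minimal sketch of how such a schedule might be computed follows. It makes two assumptions that the table does not state: that warmup ramps linearly from 0 up to the "LR Start Value", and that decay ends at a hypothetical `total_steps`. The names `lr_at_step` and `PHASE_1` are illustrative and are not taken from the repository.

```python
def lr_at_step(step, warmup_steps, lr_start, lr_final, total_steps):
    """Learning rate at `step`: linear warmup, then linear decay.

    Assumptions (not given in the table): warmup ramps from 0 to
    `lr_start`, and decay reaches `lr_final` at `total_steps`.
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Linear ramp from 0 to lr_start over the warmup steps.
        return lr_start * step / warmup_steps
    # Fraction of the decay phase completed, clamped at 1.0.
    decay_steps = total_steps - warmup_steps
    progress = min(step - warmup_steps, decay_steps) / decay_steps
    # Linear interpolation from lr_start down to lr_final.
    return lr_start + (lr_final - lr_start) * progress


# Phase 1 values copied from the removed table.
PHASE_1 = dict(warmup_steps=86, lr_start=0.012, lr_final=0.00012408)

if __name__ == "__main__":
    # total_steps=1000 is a placeholder, not the real training length.
    for s in (0, 43, 86, 543, 1000):
        print(s, lr_at_step(s, total_steps=1000, **PHASE_1))
```

The actual step counts and warmup shape are described in the paper linked above; substitute them for the placeholder `total_steps` before relying on the numbers.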