Update README.md
README.md
CHANGED
@@ -32,7 +32,6 @@ KillChain-8B is intended for:
 
 ### Training hyperparameters
 
-The following hyperparameters were used during training:
 - learning_rate: 1.5e-05
 - train_batch_size: 4
 - eval_batch_size: 4
@@ -54,7 +53,14 @@ The following hyperparameters were used during training:
 - Datasets 4.0.0
 - Tokenizers 0.22.1
 
-
+### Equipment used for training, ~1 hour real time
+
+4x NVIDIA H200 SXM
+
+
+
+### Axolotl Config
+
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>
 