Update README.md
README.md
CHANGED
@@ -254,7 +254,7 @@ print(response)
 - **Library**: [Open-R1](https://github.com/huggingface/open-r1)
 - **Total Training Tokens**: 100B Tokens
 - **Framework**: PyTorch with Transformers and TRL
-- **Optimization**: DeepSpeed ZeRO-
+- **Optimization**: DeepSpeed ZeRO-2
 - **Memory Optimization**: Gradient checkpointing, Liger kernels
 - **Monitoring**: Weights & Biases integration
 - **Hardware Used**: 8xB200 GPUs
@@ -300,4 +300,4 @@ This model is released under the Llama 3.1 Community License. Please see the [of
 
 ## Model Card Contact
 
-For questions about this model card or the model itself, please open an issue in the model repository
+For questions about this model card or the model itself, please open an issue in the model repository.