Update README.md
README.md
CHANGED
|
@@ -113,6 +113,13 @@ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
|
|
| 113 |
- **Training Steps**: Sufficient iterations to achieve mathematical reasoning improvements
|
| 114 |
- **Hardware**: Trained on an Apple M4 Max Computer using [MLX](https://github.com/ml-explore/mlx)
|
| 115 |
|
| 116 |
## Applications
|
| 117 |
|
| 118 |
This model is particularly well-suited for:
|
|
|
|
| 113 |
- **Training Steps**: Sufficient iterations to achieve mathematical reasoning improvements
|
| 114 |
- **Hardware**: Trained on an Apple M4 Max Computer using [MLX](https://github.com/ml-explore/mlx)
|
| 115 |
|
| 116 |
+
### Training Performance
|
| 117 |
+
|
| 118 |
+
The model was trained for 50 epochs, and its performance was tracked using Weights & Biases (wandb).
|
| 119 |
+
The graph below shows the training loss and validation loss (val_loss) throughout the fine-tuning process.
|
| 120 |
+
|
| 121 |
+

|
| 122 |
+
|
| 123 |
## Applications
|
| 124 |
|
| 125 |
This model is particularly well-suited for:
|