HenryShan/MathLlama3.2

Text Generation

chain-of-thought

text-generation-inference

HenryShan committed on Oct 17, 2025

Commit 4f5a5fc · verified · 1 Parent(s): 35348ef

Update README.md

Files changed (1)

README.md +7 -0

@@ -113,6 +113,13 @@ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
 - **Training Steps**: Sufficient iterations to achieve mathematical reasoning improvements
 - **Hardware**: Trained on an Apple M4 Max Computer using [MLX](https://github.com/ml-explore/mlx)
+### Training Performance
+The model was trained for 50 epochs, and its performance was tracked using Weights & Biases (wandb).
+The graph below shows the training loss and validation loss (val_loss) throughout the fine-tuning process.
+![Screenshot 2025-10-16 at 8.45.10 PM](https://cdn-uploads.huggingface.co/production/uploads/656ff69cd81a20af9c24b848/SnDzmDytorNNxYKVl8vnR.png)
 ## Applications
 This model is particularly well-suited for: