| --- |
| language: en |
| license: mit |
| base_model: BikoRiko/GPT-2.4-High-Pro |
| tags: |
| - gpt2 |
| - math |
| - fine-tuned |
| - Pro |
| - Math |
| pipeline_tag: text-generation |
| --- |
| |
| # GPT-2.5-Math |
|
|
| GPT-2.5-Math is an upgraded version of **BikoRiko/GPT-2.4-High-Pro**. It deepens the architecture with 6 additional layers and is fine-tuned specifically for mathematical reasoning on word problems. |
|
|
| ## Model Details |
| - **Architecture:** GPT-2 with 6 additional layers (approximately 0.2B parameters in total). |
| - **Training Hardware:** NVIDIA H100 (via Modal.com). |
| - **Dataset:** 5% subset of `microsoft/orca-math-word-problems-200k`. |
| - **Objective:** Fine-tuned to solve math word problems and logical queries. |
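The parameter figure above can be sanity-checked with a little arithmetic. The sketch below is an estimate under stated assumptions, not the author's actual configuration: it assumes the starting point has the standard GPT-2 small shape (12 layers, hidden size 768, vocabulary 50257, 1024 positions, output head tied to the token embeddings) and that each of the 6 added layers is a regular GPT-2 block. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer, d_model=768, vocab_size=50257, n_positions=1024):
    """Count parameters of a GPT-2 style decoder with a tied output head.

    The default sizes are those of GPT-2 small. They are an ASSUMPTION here,
    because the model card does not list the exact configuration.
    """
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d_model + n_positions * d_model
    # Fused QKV projection and attention output projection (weights + biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward network: expand to 4 * d_model, then project back.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + mlp + layer_norms
    # One final LayerNorm after the last block.
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * per_layer + final_layer_norm


base = gpt2_param_count(n_layer=12)          # assumed 12-layer starting point
expanded = gpt2_param_count(n_layer=12 + 6)  # with the 6 additional layers

print(f"base:     {base:,}")      # base:     124,439,808
print(f"expanded: {expanded:,}")  # expanded: 166,967,040
```

Under these assumptions the expanded model has roughly 167M parameters, which rounds to the ~0.2B stated above. If the base checkpoint uses a different depth or width, the numbers change accordingly.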
|
|
| ## Performance |
| The model is fine-tuned for mathematical reasoning. With only about 0.2B parameters its capacity is limited, but it shows early signs of logical grounding on basic word problems. |
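One quick way to probe this behavior is to run a basic word problem through the `transformers` text-generation pipeline. The following is a minimal sketch with several assumptions: the repository id `BikoRiko/GPT-2.5-Math` is inferred from the model name and may differ, the `Question:`/`Answer:` prompt layout is hypothetical because the card does not document a prompt format, and `build_prompt` and `solve` are illustrative helper names.

```python
# ASSUMPTION: repository id inferred from the model name; replace if it differs.
MODEL_ID = "BikoRiko/GPT-2.5-Math"


def build_prompt(question: str) -> str:
    """Wrap a word problem in a simple prompt.

    This layout is hypothetical; the card does not specify a prompt format.
    """
    return f"Question: {question.strip()}\nAnswer:"


def solve(question: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for one word problem using greedy decoding."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        build_prompt(question),
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding keeps the output deterministic
    )
    # The pipeline returns the prompt followed by the generated continuation.
    return outputs[0]["generated_text"]
```

Calling `solve("Tom has 3 apples and buys 4 more. How many apples does he have?")` downloads the checkpoint on first use and returns the prompt plus the model's continuation. Answers should be checked by hand, since a model of this size can produce fluent but incorrect arithmetic.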
|
|
| ## Training Details |
| - **Optimizer:** AdamW |
| - **Precision:** Mixed precision (`torch.amp`) |
| - **Epochs:** 3 |
| - **Learning Rate:** 5e-5 |
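The settings above can be put together into a training loop along the following lines. This is a sketch rather than the author's training script. Assumptions: the 5% subset is taken as the first 5% of the `train` split (the card does not say how it was sampled), autocast runs in `float16` with a gradient scaler (the card only names `torch.amp`), `torch.amp.GradScaler` is available (recent PyTorch releases), and each batch already contains `input_ids`, `attention_mask`, and `labels` tensors. `TrainConfig`, `load_subset`, and `train` are hypothetical names.

```python
from dataclasses import dataclass

DATASET_ID = "microsoft/orca-math-word-problems-200k"
# ASSUMPTION: "5% subset" is read as the first 5% of the train split.
SUBSET_SPLIT = "train[:5%]"


@dataclass
class TrainConfig:
    epochs: int = 3               # from the card
    learning_rate: float = 5e-5   # from the card


def load_subset():
    """Load the 5% subset using the slicing syntax of the datasets library."""
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=SUBSET_SPLIT)


def train(model, loader, cfg=None, device="cuda"):
    """Fine-tune a causal LM with AdamW and mixed precision via torch.amp.

    `loader` must yield dicts of tensors that include `labels`, so that the
    Hugging Face model returns a loss.
    """
    import torch

    cfg = cfg or TrainConfig()
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)
    # The scaler guards float16 gradients against underflow.
    scaler = torch.amp.GradScaler(device)

    for epoch in range(cfg.epochs):
        for batch in loader:
            batch = {name: tensor.to(device) for name, tensor in batch.items()}
            optimizer.zero_grad(set_to_none=True)
            # Forward pass in reduced precision; master weights stay in float32.
            with torch.amp.autocast(device_type=device, dtype=torch.float16):
                loss = model(**batch).loss
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
        print(f"epoch {epoch + 1}/{cfg.epochs} last loss: {loss.item():.4f}")
```

On an H100, `bfloat16` autocast without a gradient scaler is a common alternative. The card does not say which variant was used, so the `float16` path here is only one plausible reading.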