Buckets:
675 MB
7 files
Updated 8 days ago
| Name | Size | Xet hash |
|---|---|---|
| .gitattributes | 1.52 kB | 818ba6de |
| README.md | 928 Bytes | 1621008f |
| config.json | 961 Bytes | 7c68ce90 |
| generation_config.json | 203 Bytes | 684ac575 |
| model.safetensors | 671 MB | c931025f |
| tokenizer.json | 3.56 MB | d270cd36 |
| tokenizer_config.json | 517 Bytes | 1677fddd |
GPT-2.5-Math
GPT-2.5-Math is an upgraded version of BikoRiko/GPT-2.4-High-Pro, featuring an expanded architecture and specialized fine-tuning on mathematical reasoning.
Model Details
- Architecture: GPT-2 with 6 additional layers (Total parameters ~0.2B).
- Training Hardware: NVIDIA H100 (via Modal.com).
- Dataset: 5% subset of microsoft/orca-math-word-problems-200k.
- Objective: Fine-tuned to solve math word problems and logical queries.
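As a rough sanity check on the "~0.2B" figure, the parameter count of a GPT-2 style model can be worked out from its shape. The sketch below assumes the base model uses the standard GPT-2 small configuration (12 layers, hidden size 768, vocabulary 50257, 1024 positions, tied output embedding); the card does not state this, so treat the numbers as illustrative. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer, d_model=768, vocab_size=50257, n_positions=1024):
    """Count trainable parameters of a GPT-2 style decoder.

    Assumes the output head shares weights with the token embedding,
    so it is not counted a second time.
    """
    # Token and position embeddings.
    embeddings = vocab_size * d_model + n_positions * d_model
    # Attention: fused QKV projection plus output projection (weights + biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back (weights + biases).
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attention + mlp + layer_norms
    final_norm = 2 * d_model
    return embeddings + n_layer * per_block + final_norm


base = gpt2_param_count(12)        # assumed base depth
expanded = gpt2_param_count(18)    # base plus the 6 additional layers
fp32_megabytes = expanded * 4 / 1e6

print(base)                        # 124439808
print(expanded)                    # 166967040
print(round(fp32_megabytes, 1))    # 667.9
```

Under these assumptions the expanded model has about 0.17B parameters, which rounds to the stated ~0.2B, and its float32 weights come to roughly 668 MB, in the same range as the 671 MB `model.safetensors` file.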
Performance
The model is trained for mathematical reasoning. Although it has only about 0.2B parameters, it shows early signs of logical grounding on basic word problems.
Training Details
- Optimizer: AdamW
- Precision: Mixed Precision (torch.amp)
- Epochs: 3
- Learning Rate: 5e-5
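The settings above can be sketched as a minimal PyTorch loop. This is not the training script used for this model, which is not published on the card; the `train` function, the toy model, and the random batch are all hypothetical stand-ins. Mixed precision is enabled only when a CUDA device is available, so the same code also runs on CPU in full precision.

```python
import torch
from torch import nn


def train(model, batches, epochs=3, lr=5e-5, device=None):
    """Train with AdamW and automatic mixed precision; return mean loss per epoch.

    `batches` is a list of (inputs, targets) tensor pairs.
    `device` is expected to be the plain string "cuda" or "cpu".
    """
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    use_amp = device == "cuda"
    amp_dtype = torch.float16 if use_amp else torch.bfloat16
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    # The scaler is a no-op when disabled. Newer PyTorch releases also
    # expose it as torch.amp.GradScaler.
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    epoch_losses = []
    for _ in range(epochs):
        total = 0.0
        for inputs, targets in batches:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad(set_to_none=True)
            # Run the forward pass and loss in reduced precision where safe.
            with torch.autocast(device_type=device, dtype=amp_dtype, enabled=use_amp):
                logits = model(inputs)
                loss = nn.functional.cross_entropy(
                    logits.view(-1, logits.size(-1)), targets.view(-1)
                )
            # Scale the loss to avoid float16 gradient underflow, then step.
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
            total += loss.item()
        epoch_losses.append(total / len(batches))
    return epoch_losses


# Toy stand-in for a language model: token ids in, per-token logits out.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Embedding(16, 8), nn.Linear(8, 16))
toy_batches = [(torch.randint(0, 16, (4, 5)), torch.randint(0, 16, (4, 5)))]
losses = train(toy_model, toy_batches, device="cpu")
print(len(losses))  # 3, one mean loss per epoch
```

For a real run, the toy model would be replaced by the causal language model and `batches` by tokenized examples from the dataset subset, with the loss computed on shifted next-token targets.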
Last updated: May 20