675 MB
7 files
Updated 8 days ago
Name                     Size
.gitattributes           1.52 kB
README.md                928 Bytes
config.json              961 Bytes
generation_config.json   203 Bytes
model.safetensors        671 MB
tokenizer.json           3.56 MB
tokenizer_config.json    517 Bytes
README.md

GPT-2.5-Math

GPT-2.5-Math is an upgraded version of BikoRiko/GPT-2.4-High-Pro, featuring an expanded architecture and specialized fine-tuning on mathematical reasoning.

Model Details

  • Architecture: GPT-2 with 6 additional transformer layers (about 0.2B parameters in total).
  • Training Hardware: NVIDIA H100 (via Modal.com).
  • Dataset: 5% subset of microsoft/orca-math-word-problems-200k.
  • Objective: Fine-tuned to solve math word problems and logical queries.
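
The "~0.2B" figure can be sanity-checked with a short parameter count. The sketch below assumes the base is the standard 12-layer GPT-2 configuration (hidden size 768, vocabulary 50257, context 1024, tied input/output embeddings), which the README does not state; `gpt2_param_count` is a hypothetical helper written for this illustration.

```python
def gpt2_param_count(n_layer, n_embd=768, vocab_size=50257, n_positions=1024):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    # Token and position embedding tables.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Attention: fused QKV projection plus output projection (weights + biases).
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * n_embd)
    per_layer = attn + mlp + layer_norms
    final_ln = 2 * n_embd
    return embeddings + n_layer * per_layer + final_ln


base = gpt2_param_count(12)      # assumed base depth
expanded = gpt2_param_count(18)  # base plus the 6 additional layers
print(f"base: {base:,}")                              # base: 124,439,808
print(f"expanded: {expanded:,}")                      # expanded: 166,967,040
print(f"float32 size: {expanded * 4 / 1e6:.0f} MB")   # float32 size: 668 MB
```

Under these assumptions the expanded model has roughly 0.17B parameters, which rounds to the stated ~0.2B. Stored as float32 that is about 668 MB, close to the 671 MB listed for model.safetensors.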

Performance

The model is trained for mathematical reasoning. Although it has only about 0.2B parameters, it shows early signs of logical grounding on basic word problems.

Training Details

  • Optimizer: AdamW
  • Precision: Mixed Precision (torch.amp)
  • Epochs: 3
  • Learning Rate: 5e-5

Last updated: May 20