BikoRiko/GPT-2.5-Math

	---
	language: en
	license: mit
	base_model: BikoRiko/GPT-2.4-High-Pro
	tags:
	- gpt2
	- math
	- fine-tuned
	- Pro
	- Math
	pipeline_tag: text-generation
	---

	# GPT-2.5-Math

	GPT-2.5-Math is an upgraded version of `BikoRiko/GPT-2.4-High-Pro`, with an expanded architecture and fine-tuning specialized for mathematical reasoning.

	## Model Details
	- Architecture: GPT-2 with 6 additional layers (about 0.2B parameters in total).
	- Training Hardware: NVIDIA H100 (via Modal.com).
	- Dataset: 5% subset of `microsoft/orca-math-word-problems-200k`.
	- Objective: Fine-tuned to solve math word problems and logical queries.
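
The parameter figure above can be sanity-checked with a little arithmetic. The sketch below assumes the starting point has the dimensions of the standard 12-layer GPT-2 (hidden size 768, vocabulary 50257, 1024 positions, tied input/output embeddings); this card does not state those dimensions, so treat them as an assumption. The helper `gpt2_param_count` is a hypothetical name introduced for illustration.

```python
def gpt2_param_count(n_layer, n_embd=768, vocab_size=50257, n_positions=1024):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings.

    The default dimensions are an assumption (standard 12-layer GPT-2 sizes),
    not values taken from this model card.
    """
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # One LayerNorm holds a weight and a bias vector.
    layer_norm = 2 * n_embd
    # Fused QKV projection plus the attention output projection (weights + biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two-layer MLP with a 4x hidden expansion (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Each block has two LayerNorms, one attention module, one MLP.
    per_layer = 2 * layer_norm + attention + mlp
    # A final LayerNorm follows the last block.
    return embeddings + n_layer * per_layer + layer_norm


base = gpt2_param_count(12)      # assumed 12-layer starting point
extended = gpt2_param_count(18)  # 12 + 6 additional layers
print(f"{base:,} -> {extended:,}")  # prints "124,439,808 -> 166,967,040"
```

Under these assumptions, six extra blocks bring the total to roughly 167M parameters, which rounds to the ~0.2B stated above.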

	## Performance
	The model is fine-tuned for mathematical reasoning. At about 0.2B parameters it is a small model, and it shows early signs of logical grounding on basic word problems.
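
To try the model on a word problem, something like the following may work. It is a minimal sketch that assumes the repository `BikoRiko/GPT-2.5-Math` loads with the stock `transformers` text-generation pipeline. The prompt template in `build_prompt` is hypothetical, since this card does not document the prompt format used during fine-tuning, and `solve` is an illustrative helper name.

```python
MODEL_ID = "BikoRiko/GPT-2.5-Math"


def build_prompt(question: str) -> str:
    # Hypothetical template: the card does not specify a prompt format.
    return f"Question: {question.strip()}\nAnswer:"


def solve(question: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import pipeline

    # Assumes the checkpoint is compatible with the standard pipeline.
    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        build_prompt(question),
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for repeatable answers
    )
    # The pipeline returns a list of dicts; the text includes the prompt.
    return outputs[0]["generated_text"]
```

Calling `solve("Tom has 3 apples and buys 4 more. How many apples does he have?")` downloads the weights on first use and returns the prompt followed by the model's continuation.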

	## Training Details
	- Optimizer: AdamW
	- Precision: mixed precision (`torch.amp`)
	- Epochs: 3
	- Learning Rate: 5e-5
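
The settings above could be wired together roughly as follows. This is a sketch, not the actual training script: it assumes a model that maps token IDs straight to logits, uses bfloat16 autocast (so no gradient scaler is needed) where the original run may have used float16 with a scaler, and leaves out details the card does not mention, such as learning-rate schedule, batch size, and weight decay. The function name `train` and the toy model in the usage note are illustrative.

```python
import torch
from torch import nn

EPOCHS = 3            # from the card
LEARNING_RATE = 5e-5  # from the card


def train(model, batches, device="cpu"):
    """Run a causal-LM fine-tuning loop with AdamW and torch.amp autocast.

    `batches` is an iterable of LongTensors shaped (batch, seq_len).
    Returns the list of per-step loss values.
    """
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
    losses = []
    for _ in range(EPOCHS):
        for input_ids in batches:
            input_ids = input_ids.to(device)
            optimizer.zero_grad(set_to_none=True)
            # Assumption: bfloat16 autocast; works on CPU and on recent GPUs.
            with torch.amp.autocast(device_type=device, dtype=torch.bfloat16):
                logits = model(input_ids)
                # Next-token prediction: position t predicts token t + 1.
                loss = nn.functional.cross_entropy(
                    logits[:, :-1].reshape(-1, logits.size(-1)).float(),
                    input_ids[:, 1:].reshape(-1),
                )
            loss.backward()
            optimizer.step()
            losses.append(loss.item())
    return losses
```

A quick smoke test on CPU could use a toy model such as `nn.Sequential(nn.Embedding(32, 16), nn.Linear(16, 32))` with a few random `torch.randint(0, 32, (4, 8))` batches; on an H100, pass `device="cuda"`.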