---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---
|
|
|
|
|
# OpenMath |
|
|
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning |
|
|
|
|
|
## Overview |
|
|
OpenMath is an open-source project focused on fine-tuning a small language model for step-by-step math reasoning using QLoRA (LoRA on a 4-bit quantized base model).
|
|
|
|
|
This repository contains only a LoRA adapter trained on GSM8K. Users must load the base model separately and attach the adapter. |
|
|
|
|
|
The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can handle this kind of fine-tuning with the standard Hugging Face and PyTorch stack.
|
|
|
|
|
--- |
|
|
|
|
|
## Base Model |
|
|
[`Qwen/Qwen2.5-Math-1.5B`](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B)
|
|
|
|
|
This repository does not contain the base model weights; download them from the Hugging Face Hub.
|
|
|
|
|
--- |
|
|
|
|
|
## Hardware Used (Latest Training Run) |
|
|
|
|
|
- **GPU:** AMD MI300X (ROCm 7.0)
- **VRAM:** 192 GB
- **Operating system:** Ubuntu 24.04
- **Framework:** PyTorch + Hugging Face
- **Backend:** ROCm
|
|
|
|
|
--- |
|
|
|
|
|
## Dataset |
|
|
|
|
|
- **Dataset:** GSM8K (Grade School Math 8K)
- **Training samples:** 1,000
- **Evaluation:** full GSM8K test split (1,319 problems)
|
|
|
|
|
The loss was computed only on the solution portion of each example; the prompt tokens were masked out of the loss.
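As an illustration, here is a minimal sketch of this kind of label masking. The function and field names are assumptions for illustration, not the exact training code:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(prompt: str, solution: str, max_len: int = 1024):
    # Tokenize prompt and solution separately so the boundary is known.
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token,
                             add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_len]
    # Labels of -100 are ignored by the cross-entropy loss, so only the
    # solution tokens contribute to the training signal.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```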
|
|
|
|
|
--- |
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
- **Method:** QLoRA (4-bit)
- **Quantization:** NF4 with float16 compute
- **LoRA rank:** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`
- **Max sequence length:** 1024
- **Batch size:** 1
- **Gradient accumulation:** 16
- **Effective batch size:** 16
- **Learning rate:** 1e-4
- **Optimizer:** `paged_adamw_8bit`
- **Scheduler:** cosine
- **Warmup:** 5%
- **Epochs:** 6
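For reference, this configuration maps roughly to the following sketch with `transformers`, `peft`, and `bitsandbytes`. The hyperparameters are taken from the list above; everything else, including `output_dir`, is an illustrative assumption:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with float16 compute (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA on the attention projections only.
model = get_peft_model(model, LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))

training_args = TrainingArguments(
    output_dir="openmath-lora",      # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size 16
    learning_rate=1e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=6,
)
```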
|
|
|
|
|
--- |
|
|
|
|
|
## Results |
|
|
|
|
|
**GSM8K accuracy (full test set):** 750 / 1,319 correct (**56.86%**)
|
|
|
|
|
This is a significant improvement over the earlier Colab T4 run and a strong result for a 1.5B-parameter model fine-tuned with LoRA.
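GSM8K reference solutions end with a final line of the form `#### <answer>`, and accuracy numbers like the one above are typically computed by exact match on the extracted final answer. A sketch of such an extraction follows; it is illustrative, not this project's actual evaluation script:

```python
import re

def extract_final_answer(text: str) -> str | None:
    # GSM8K references end with "#### <number>"; fall back to the last
    # number in the text for free-form model outputs.
    match = re.search(r"####\s*([-\d,\.]+)", text)
    if match:
        return match.group(1).replace(",", "").rstrip(".")
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return numbers[-1].replace(",", "") if numbers else None
```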
|
|
|
|
|
--- |
|
|
|
|
|
## What This Repository Contains |
|
|
|
|
|
- `adapter_model.safetensors` — LoRA adapter weights
- `adapter_config.json` — LoRA configuration
- `chat_template.jinja` — chat formatting template
- `tokenizer.json` — tokenizer file
- `tokenizer_config.json` — tokenizer settings
- `README.md` — this documentation
|
|
|
|
|
This repository does not include checkpoints, optimizer states, or full base model weights. |
|
|
|
|
|
--- |
|
|
|
|
|
## How to Use This Model |
|
|
|
|
|
Load the base model [`Qwen/Qwen2.5-Math-1.5B`](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B) from the Hugging Face Hub, then attach this LoRA adapter with PEFT. Prompts should follow the training format: an instruction, the problem, and a solution section.
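A minimal sketch is shown below. The adapter repo id is a placeholder for this repository's id, and the exact prompt wording is an assumption based on the training format described above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

ADAPTER_REPO = "<this-repo-id>"  # placeholder: replace with this repository's id

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(ADAPTER_REPO)
model = PeftModel.from_pretrained(base, ADAPTER_REPO)

# Prompt with instruction, problem, and solution sections (wording assumed).
prompt = (
    "Instruction: Solve the following math problem step by step.\n\n"
    "Problem: Natalia sold clips to 48 of her friends in April, and then she "
    "sold half as many clips in May. How many clips did Natalia sell "
    "altogether in April and May?\n\n"
    "Solution:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```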
|
|
|
|
|
--- |
|
|
|
|
|
## Why This Matters |
|
|
|
|
|
- It demonstrates that the AMD MI300X can train modern language models with Hugging Face and QLoRA.
- It shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
- It provides a lightweight adapter instead of requiring users to download full fine-tuned model weights.
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.
|
|
|
|
|
--- |
|
|
|
|
|
## Future Work |
|
|
|
|
|
- Train on 3,000 to 5,000 GSM8K samples.
- Add the SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on MI300X.
- Add a Streamlit demo for interactive use.
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) (non-commercial use only).