Neural-Hacker committed 9cc842e (verified, parent: 1352d51)

Update README.md

Files changed (1): README.md (+134 −3)
---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath

Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

## Overview

OpenMath is an open-source project focused on fine-tuning a small language model for math reasoning using QLoRA (4-bit LoRA).

This repository contains only a LoRA adapter trained on GSM8K. Users must load the base model separately and attach the adapter (see the How to Use This Model section below).

The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can run standard Hugging Face + PyTorch fine-tuning workflows.

---

## Base Model

Qwen/Qwen2.5-Math-1.5B

This repository does not contain the base model weights; they must be loaded from Hugging Face separately.

---

## Hardware Used (Latest Training Run)

- GPU: AMD MI300X (ROCm 7.0)
- VRAM: 192 GB
- Operating system: Ubuntu 24.04
- Framework: PyTorch + Hugging Face
- Backend: ROCm

## Dataset

- GSM8K (Grade School Math 8K)
- Training samples: 1,000
- Evaluation: full GSM8K test split (1,319 problems)

Loss was computed only on the solution portion of each example; prompt tokens were excluded via loss masking (a sketch of this appears below).

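One common way to implement solution-only masking is to set the prompt tokens' labels to -100, the value that Hugging Face's cross-entropy loss ignores. A minimal sketch, assuming prompt and solution are concatenated into one sequence (the original training script is not included in this repo):

```python
# Sketch of solution-only loss masking (illustrative, not the original script).
# Tokens labeled -100 are ignored by the cross-entropy loss in transformers.
def build_example(tokenizer, prompt: str, solution: str, max_len: int = 1024):
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution, add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + solution_ids + [tokenizer.eos_token_id])[:max_len]
    labels = ([-100] * len(prompt_ids) + solution_ids + [tokenizer.eos_token_id])[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```
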
---

## Training Configuration

- Method: QLoRA (4-bit)
- Quantization: NF4 with float16 compute
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj
- Max sequence length: 1024
- Batch size: 1
- Gradient accumulation steps: 16
- Effective batch size: 16 (1 × 16)
- Learning rate: 1e-4
- Optimizer: paged_adamw_8bit
- Scheduler: cosine
- Warmup ratio: 5%
- Epochs: 6

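These hyperparameters map directly onto the standard bitsandbytes and PEFT configuration objects. A sketch of that mapping (illustrative; the actual training script and data pipeline are not part of this repository, and the output path is a placeholder):

```python
# Illustrative reconstruction of the configuration above using transformers + peft.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # QLoRA: 4-bit base weights
    bnb_4bit_quant_type="nf4",             # NF4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # float16 compute
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="openmath-qlora",           # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,        # effective batch size 16
    learning_rate=1e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=6,
)
```
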
---

## Results

GSM8K accuracy (full test set): 750 / 1,319 correct = 56.86%.

This is significantly stronger than the earlier Colab T4 run and is a solid result for a 1.5B model trained with LoRA.

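Accuracy here means exact match on the final numeric answer; GSM8K reference solutions end in `#### <answer>`. A sketch of the scoring logic (the answer-extraction heuristic for model outputs is an assumption, since the original evaluation script is not included):

```python
# Illustrative GSM8K scoring: compare the last number in the model output
# against the number after '####' in the reference solution.
import re

def last_number(text: str):
    matches = re.findall(r"-?\d+\.?\d*", text.replace(",", ""))
    return matches[-1] if matches else None

def gsm8k_accuracy(predictions, references):
    correct = sum(
        last_number(pred) == last_number(ref.split("####")[-1])
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)  # e.g. 750 / 1319 ≈ 0.5686
```
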
---

## What This Repository Contains

- adapter_model.safetensors: LoRA weights
- adapter_config.json: LoRA configuration
- chat_template.jinja: chat formatting template
- tokenizer.json: tokenizer file
- tokenizer_config.json: tokenizer settings
- README.md: documentation

This repository does not include checkpoints, optimizer states, or the full base model weights.

---

## How to Use This Model

Load the base model Qwen/Qwen2.5-Math-1.5B from Hugging Face, then attach this LoRA adapter using PEFT. Generate answers using a prompt that includes an instruction, problem, and solution section, as in the sketch below.

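A minimal loading and generation sketch (the adapter repo id and the prompt wording below are placeholders; the exact prompt format comes from the bundled chat_template.jinja and the training setup):

```python
# Minimal usage sketch: load the 4-bit base model and attach the LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Math-1.5B"
adapter_id = "<this-repo-id>"  # placeholder: the Hub id of this adapter repo

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(adapter_id)  # tokenizer files ship with the adapter
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Illustrative instruction/problem/solution prompt (assumed format).
prompt = (
    "### Instruction:\nSolve the math problem step by step.\n\n"
    "### Problem:\nNatalia sold clips to 48 of her friends in April, and then "
    "she sold half as many clips in May. How many clips did she sell altogether?\n\n"
    "### Solution:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
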
---

## Why This Matters

- This project demonstrates that the AMD MI300X can train modern language models with Hugging Face and QLoRA.
- It shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
- It provides a lightweight adapter instead of requiring users to download a massive full model.

---

## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---

## Future Work

- Train on 3,000 to 5,000 GSM8K samples.
- Add the SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on MI300X.
- Add a Streamlit demo for interactive use.

---

## License

cc-by-nc-4.0