---
datasets:
- XenArcAI/MathX-5M
base_model:
- google/gemma-3-1b-it
---

# Model Card: Parveshiiii/M1-MathX

## Model Details
- **Model Name:** Parveshiiii/M1-MathX
- **Base Architecture:** Gemma 3 (1B parameters)
- **Model Type:** Causal Language Model (text-generation)
- **Training Framework:** Hugging Face Transformers
- **Precision:** fp16
- **Attention Mechanism:** Hybrid sliding-window and full attention layers
- **Tokenizer:** Gemma tokenizer (vocab size 262,144)

## Intended Use
- Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations.
- Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.
- Not intended for general-purpose conversation or for sensitive domains outside mathematics.

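As a minimal sketch, a math problem can be framed as a single user turn using Gemma-style chat markers. The hand-rolled helper below is an illustration only; in practice the tokenizer's own `apply_chat_template` is the authoritative source for the exact format.

```python
def format_math_prompt(problem: str) -> str:
    """Render one user turn with Gemma-style chat markers.

    Hand-rolled illustration; prefer tokenizer.apply_chat_template
    in real code, since it encodes the canonical template.
    """
    return (
        "<start_of_turn>user\n"
        f"{problem.strip()}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_math_prompt("Solve for x: 2x + 3 = 11. Show your steps.")
```

The trailing `<start_of_turn>model` marker cues the model to begin its answer immediately after the prompt.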
## Training Data
- **Dataset:** MathX ([XenArcAI/MathX-5M](https://huggingface.co/datasets/XenArcAI/MathX-5M), a curated mathematical reasoning dataset)
- **Samples Used:** ~300
- **Training Steps:** 50
- **Method:** GRPO (Group Relative Policy Optimization) fine-tuning
- **Objective:** Reinforcement-style alignment for improved reasoning clarity and correctness.

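The core idea behind GRPO is that several completions are sampled per prompt, each is scored, and each completion's advantage is its reward normalized against its own group, with no separate learned value function. A minimal sketch of that normalization step (the actual fine-tuning pipeline involves considerably more machinery):

```python
def group_relative_advantages(rewards):
    """Normalize a group of per-completion rewards into advantages.

    GRPO scores each sampled completion relative to the mean (and
    standard deviation) of its own group, so correct completions get
    positive advantage and incorrect ones negative.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    return [(r - mean) / (std if std > 0 else 1.0) for r in rewards]

# Four completions for one prompt: two correct (reward 1), two wrong (reward 0).
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
# → [1.0, -1.0, 1.0, -1.0]
```

The guard against zero standard deviation matters: when every completion in a group earns the same reward, all advantages are zero and that group contributes no gradient signal.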
## Performance
- Shows promising results on small-scale math problems and symbolic reasoning tasks in informal testing.
- Early spot checks suggest improved accuracy over the base Gemma 1B model on math-specific prompts, though this has not yet been quantified.
- Formal evaluation on GSM8K, MATH, and other benchmarks is still needed for a quantitative comparison.

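Benchmarks such as GSM8K are typically scored by exact match on the final numeric answer. A hedged sketch of that scoring step (the regex and the `####` reference marker are assumptions based on the common GSM8K answer format):

```python
import re

def extract_final_number(text: str):
    """Return the last number in a solution string, or None.

    GSM8K references conventionally end with '#### <answer>'; taking
    the last number in the text also works for free-form model output.
    """
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

def exact_match_accuracy(predictions, references):
    """Fraction of examples whose final numbers agree."""
    hits = sum(
        extract_final_number(p) == extract_final_number(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)
```

Answer extraction like this is deliberately lenient about the surrounding reasoning text, so a model is credited whenever its final number is correct regardless of formatting.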
## Limitations
- Small dataset and limited training steps mean coverage is narrow.
- May overfit to MathX patterns and fail on broader or more complex problems.
- Not guaranteed to generalize outside mathematical reasoning.
- As a 1B-parameter model, capacity is limited compared to larger LLMs.

## Ethical Considerations
- Intended for safe educational use.
- Should not be deployed in high-stakes environments without further validation.
- Outputs may contain errors; human oversight is required.

## Citation
If you use this model, please cite as:
```bibtex
@misc{Parvesh2025M1MathX,
  author = {Parvesh Rawal},
  title = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
  year = {2025},
  howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
}
```

---