LLaMA-3.2-1B-Math (QLoRA)

Model Summary

LLaMA-3.2-1B-Math is a compact math-reasoning model fine-tuned from LLaMA 3.2-1B using 4-bit QLoRA. It is designed to study domain-specialized adaptation under strict compute limits.

Model Details

  • Base model: LLaMA 3.2-1B
  • Architecture: Decoder-only Transformer
  • Fine-tuning: QLoRA (4-bit NF4)
  • Training environment: Kaggle (single-GPU)
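
As a rough illustration of the 4-bit NF4 setup listed above, the sketch below loads a causal LM through the Hugging Face `transformers` and `bitsandbytes` integration. The card states only "4-bit NF4". Double quantization, the float16 compute dtype, and `device_map="auto"` are assumptions, not settings confirmed by this card.

```python
# Keyword arguments for transformers.BitsAndBytesConfig.
# Only the NF4 quant type comes from the card; the rest are assumed defaults.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",       # stated in the card
    "bnb_4bit_use_double_quant": True,  # assumption: common QLoRA setting
}


def load_base_4bit(model_id):
    """Load a causal LM with 4-bit NF4 weights.

    Requires torch, transformers, bitsandbytes and a CUDA GPU, so the
    heavy imports are deferred until the function is actually called.
    """
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute
        **QUANT_KWARGS,
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # assumption: let accelerate place the weights
    )
```

Calling `load_base_4bit("meta-llama/Llama-3.2-1B")` would load the base checkpoint, assuming that Hub repository ID and approved access to the gated repository.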

Datasets

Dataset    Purpose
GSM8K      Grade-school arithmetic reasoning
MATH       Competition-level mathematical problems

Both datasets were converted to a plain-text format with no chat template, to match the base model, which was not trained with a chat format.
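
A minimal sketch of what such plain-text formatting could look like is shown below. The field names follow the public dataset schemas (GSM8K uses `question`/`answer`, MATH uses `problem`/`solution`). The `Problem:`/`Solution:` labels and the `format_example` helper are hypothetical, since the card does not specify the exact layout.

```python
def format_example(record):
    """Render one dataset record as plain text, with no chat template."""
    # GSM8K records carry "question"/"answer";
    # MATH records carry "problem"/"solution".
    if "question" in record:
        problem, solution = record["question"], record["answer"]
    else:
        problem, solution = record["problem"], record["solution"]
    # Hypothetical layout: two labelled sections and no role markers.
    return f"Problem: {problem.strip()}\nSolution: {solution.strip()}"


if __name__ == "__main__":
    print(format_example({"question": "What is 2 + 2?", "answer": "2 + 2 = 4\n#### 4"}))
```

Using the same labels at inference time keeps prompts close to the training distribution, which matters for a model that is sensitive to prompt phrasing.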

Training Procedure

  • LoRA adapters inserted into attention and MLP projections
  • Mixed math datasets concatenated and split into train/eval
  • Hyperparameters tuned for stable training at the small batch sizes a single GPU allows
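
The first two steps above can be sketched as follows. The module names are the linear layers of Hugging Face Llama decoder blocks. The rank, alpha, dropout, evaluation fraction, and seed are illustrative assumptions, and `concat_and_split` is a hypothetical helper rather than the author's script.

```python
import random

# Linear layers inside each Hugging Face Llama decoder block.
ATTENTION_PROJECTIONS = ["q_proj", "k_proj", "v_proj", "o_proj"]
MLP_PROJECTIONS = ["gate_proj", "up_proj", "down_proj"]

# Keyword arguments for peft.LoraConfig; numeric values are assumptions.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ATTENTION_PROJECTIONS + MLP_PROJECTIONS,
    "task_type": "CAUSAL_LM",
}


def build_lora_config():
    """Build the adapter config (requires peft; imported lazily)."""
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


def concat_and_split(datasets, eval_fraction=0.05, seed=42):
    """Concatenate several example lists, shuffle, and split train/eval."""
    combined = [example for dataset in datasets for example in dataset]
    random.Random(seed).shuffle(combined)  # seeded for reproducibility
    n_eval = max(1, int(len(combined) * eval_fraction))
    return combined[n_eval:], combined[:n_eval]
```

Shuffling before the split mixes GSM8K and MATH examples in both partitions, so the evaluation loss reflects both difficulty levels.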

Intended Use

  • Math reasoning benchmarks
  • Research on small-model specialization
  • Educational demonstrations of QLoRA fine-tuning
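
For benchmark use, one common way to score GSM8K is exact match on the final answer, which GSM8K reference solutions mark with `####`. The sketch below assumes the model's completions follow the same convention, which this card does not confirm. MATH answers use a different format and would need a separate extractor. Both helper names are hypothetical.

```python
import re


def extract_final_answer(text):
    """Return the number after the last '####' marker, or None if absent."""
    matches = re.findall(r"####\s*(-?[\d,]*\.?\d+)", text)
    if not matches:
        return None
    # Drop thousands separators so "1,234" and "1234" compare equal.
    return matches[-1].replace(",", "")


def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose final answer equals the reference's."""
    if not references:
        return 0.0
    correct = 0
    for prediction, reference in zip(predictions, references):
        predicted = extract_final_answer(prediction)
        if predicted is not None and predicted == extract_final_answer(reference):
            correct += 1
    return correct / len(references)
```

The comparison is on normalized strings, so numerically equal answers written differently (for example `5` and `5.0`) count as mismatches in this sketch.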

Out-of-Scope Uses

  • General chat or instruction following
  • Safety-critical or production systems

Known Limitations

  • Struggles with long proofs and symbolic manipulation
  • Sensitive to prompt phrasing
  • No RLHF or instruction tuning applied

Ethical Considerations

This model inherits biases and limitations from its base model and training data.

Acknowledgements

  • LLaMA (Meta)
  • GSM8K (OpenAI)
  • MATH (fdyrd/MATH)