---
base_model: meta-llama/Meta-Llama-3.1-8B
library_name: peft
license: llama3.1
datasets:
- openai/gsm8k
language:
- en
metrics:
- accuracy
- perplexity
pipeline_tag: text-generation
tags:
- llama.cpp
- unsloth
- transformers
- math
- custom-instruction
- LoRA
---

# 🧮 EfficientMath-AI (Llama 3.1 8B)
|
|
## 📌 Project Overview
EfficientMath-AI is a parameter-efficient fine-tuned (PEFT) version of Meta's **Llama-3.1-8B**, specifically optimized to solve multi-step, grade-school math word problems. It was trained using LoRA (Low-Rank Adaptation) and compressed into a 4-bit quantized GGUF format, allowing it to perform high-level mathematical reasoning efficiently on standard CPU hardware.
|
|
| **Creator:** Abhay Aditya |
| **Live Interactive Demo:** [EfficientMath-AI Web App](https://huggingface.co/spaces/iamabhayaditya/EfficientMath-AI) |
|
|
## 🧠 Model Details
| * **Base Model:** `meta-llama/Meta-Llama-3.1-8B` |
| * **Fine-Tuning Method:** LoRA (Rank = 16, Alpha = 16) via Unsloth |
| * **Dataset:** GSM8K (Grade School Math 8K) |
| * **Quantization:** `Q4_K_M` (4-bit GGUF) |
| * **Parameters:** 8 Billion |
| * **Deployment Context:** Designed for high-speed, CPU-only inference via `llama.cpp`. |
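The LoRA setup above (rank 16, alpha 16) freezes the base weights and learns only a low-rank update. A minimal pure-Python sketch of the idea, using toy rank-1 matrices rather than the model's actual dimensions (all values here are illustrative, not taken from the trained adapter):

```python
# Toy illustration of a LoRA update: W_eff = W + (alpha / r) * (B @ A).
# Shapes and values are made up for demonstration only.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, r, alpha):
    """Combine a frozen weight W (d x k) with low-rank adapters
    B (d x r) and A (r x k), scaled by alpha / r."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# 2x2 frozen weight with rank-1 adapters (r = 1) for brevity
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d x r
A = [[0.5, 0.5]]     # r x k
print(lora_effective_weight(W, A, B, r=1, alpha=1))
# -> [[1.5, 0.5], [1.0, 2.0]]
```

Only `A` and `B` are trained, which is why an 8B-parameter model can be fine-tuned on a single free-tier GPU.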
|
|
## 📊 Evaluation & Performance
The model was evaluated on the GSM8K test split, with scoring based on strict extraction of the final numeric answer and on the coherence of the step-by-step reasoning.
| * **Overall Accuracy:** 66% |
| * **Training Hardware:** Single NVIDIA T4 GPU (Free Tier) |
| * **Inference Hardware Requirement:** ~8GB RAM (Basic CPU) |
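The exact scoring code is not published with this card, but "strict numeric extraction" for GSM8K-style answers can be sketched as follows (the function names and regex are assumptions, not the actual evaluation harness):

```python
import re

def extract_final_number(text):
    """Return the last number in a generated solution as a float.
    GSM8K answers are numeric, so the final number in the text is
    taken as the model's answer (thousands commas stripped)."""
    matches = re.findall(r"-?\d[\d,]*\.?\d*", text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))

def is_correct(generated, gold):
    """Strict numeric match against the gold answer."""
    pred = extract_final_number(generated)
    return pred is not None and abs(pred - float(gold)) < 1e-6

print(is_correct("18 apples cost 90, so 1 apple costs 5. 24 * 5 = 120", "120"))
```

Under this kind of check, a solution only counts as correct when its final number matches the reference answer exactly, which is stricter than fuzzy string matching.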
|
|
|  |
|
|
### Diagnostic Insights
| 1. **Perplexity:** The model exhibits a tightly clustered, low perplexity distribution (between 2.5 and 4.0), demonstrating high confidence and fluency in generating mathematical syntax. |
2. **Complexity Ceiling:** The model reaches roughly 80% accuracy on short word problems, producing a concise, accurate chain of thought without hallucinating verbose responses. Like many 8B-class models, however, its accuracy declines as prompt length grows, particularly on complex, multi-paragraph logic puzzles.
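Perplexity, as reported above, is the exponential of the mean per-token negative log-likelihood. A minimal computation from per-token log-probabilities (the values below are made up to land inside the 2.5–4.0 band mentioned above):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean(log p(token))) over a sequence."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Illustrative log-probs for a short, confidently predicted sequence
logps = [-0.9, -1.1, -1.0, -1.2]
print(round(perplexity(logps), 2))  # -> 2.86
```

Lower values mean the model assigns higher probability to the tokens it emits, which is why a tight 2.5–4.0 distribution indicates fluent, confident generation of mathematical syntax.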
|
|
## 💻 Usage Example (Python)
| If you wish to run this model locally, you can use `llama-cpp-python`: |
|
|
```python
from llama_cpp import Llama

# Load the 4-bit quantized GGUF model for CPU-only inference
llm = Llama(
    model_path="Meta-Llama-3.1-8B.Q4_K_M.gguf",
    n_ctx=2048,    # context window size
    n_threads=4    # CPU threads to use
)

output = llm(
    "Below is a math word problem. Solve it step by step and provide the final answer.\n\n### Problem:\nIf the cost of 18 apples is 90 rupees, what is the cost of 24 apples?\n\n### Solution:\n",
    max_tokens=256,
    temperature=0.2,       # low temperature for deterministic math reasoning
    stop=["<|eot_id|>"]    # stop at Llama 3.1's end-of-turn token
)
print(output["choices"][0]["text"])
```