iamabhayaditya committed (verified)
Commit 968153d · 1 Parent(s): 9cc09c6

Update README.md

Files changed (1): README.md (+60 −13)

README.md CHANGED
@@ -1,20 +1,67 @@
  ---
  tags:
- - gguf
- - llama.cpp
- - unsloth
-
  ---

- # EfficientMath-AI : GGUF

- This model was finetuned and converted to GGUF format using [Unsloth](https://github.com/unslothai/unsloth).

- **Example usage**:
- - For text only LLMs: `llama-cli -hf iamabhayaditya/EfficientMath-AI --jinja`
- - For multimodal models: `llama-mtmd-cli -hf iamabhayaditya/EfficientMath-AI --jinja`

- ## Available Model files:
- - `Meta-Llama-3.1-8B.Q4_K_M.gguf`
- This was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

---
base_model: meta-llama/Meta-Llama-3.1-8B
library_name: peft
license: llama3.1
datasets:
- openai/gsm8k
language:
- en
metrics:
- accuracy
- perplexity
pipeline_tag: text-generation
tags:
- gguf
- llama.cpp
- unsloth
- transformers
- math
- lora
- custom-instruction
---

# 🧮 EfficientMath-AI (Llama 3.1 8B)

## 📌 Project Overview
EfficientMath-AI is a parameter-efficient fine-tuned (PEFT) version of Meta's **Llama-3.1-8B**, optimized for solving multi-step, grade-school math word problems. It was trained with LoRA (Low-Rank Adaptation) and quantized to 4-bit GGUF format, allowing it to run step-by-step mathematical reasoning efficiently on standard CPU hardware.

**Creator:** Abhay Aditya
**Live Interactive Demo:** [EfficientMath-AI Web App](https://huggingface.co/spaces/iamabhayaditya/EfficientMath-AI)

## 🧠 Model Details
* **Base Model:** `meta-llama/Meta-Llama-3.1-8B`
* **Fine-Tuning Method:** LoRA (rank = 16, alpha = 16) via Unsloth
* **Dataset:** GSM8K (Grade School Math 8K)
* **Quantization:** `Q4_K_M` (4-bit GGUF)
* **Parameters:** 8 billion
* **Deployment Context:** designed for fast, CPU-only inference via `llama.cpp`

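The LoRA settings above (rank 16, alpha 16) adapt only a small fraction of the 8B weights. A back-of-envelope sketch, assuming the adapter targets the attention and MLP projections (a common Unsloth default; the card does not list the actual target modules):

```python
# Back-of-envelope: trainable parameters for rank-16 LoRA on Llama-3.1-8B.
# Dimensions are the published Llama-3.1-8B shapes; the target-module list
# below is an assumption, not stated in the model card.
HIDDEN, KV, FFN, LAYERS, RANK = 4096, 1024, 14336, 32, 16

targets = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),      # GQA: 8 KV heads -> 1024-dim K/V projections
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}

def lora_params(shape, r=RANK):
    # A rank-r adapter on an (in, out) linear adds r*in + r*out weights (A and B).
    fan_in, fan_out = shape
    return r * (fan_in + fan_out)

total = LAYERS * sum(lora_params(s) for s in targets.values())
print(f"{total:,} trainable LoRA weights (~{100 * total / 8e9:.2f}% of 8B)")
# → 41,943,040 trainable LoRA weights (~0.52% of 8B)
```

Training well under 1% of the weights is what makes fine-tuning feasible on a single free-tier T4.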
## 📊 Evaluation & Performance
The model was evaluated on a held-out test split of GSM8K, using strict numeric-answer extraction and checking the coherence of the step-by-step reasoning.
* **Overall Accuracy:** 66%
* **Training Hardware:** single NVIDIA T4 GPU (free tier)
* **Inference Hardware Requirement:** ~8 GB RAM (basic CPU)

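The ~8 GB figure is consistent with a rough estimate of the Q4_K_M footprint. A minimal sketch, assuming an effective average of about 4.85 bits per weight for Q4_K_M (an approximation — the exact rate varies per tensor), with the remaining headroom covering the KV cache and runtime overhead:

```python
# Rough memory estimate for an 8B model quantized to Q4_K_M.
# 4.85 bits/weight is an assumed average effective rate, not an exact figure.
params = 8_030_000_000           # Llama-3.1-8B has ~8.03B parameters
bits_per_weight = 4.85
weight_gib = params * bits_per_weight / 8 / 2**30
print(f"~{weight_gib:.1f} GiB of weights")  # ~4.5 GiB
```

With a 2048-token context and runtime overhead on top of the weights, this fits comfortably within the ~8 GB requirement stated above.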
### Diagnostic Insights
1. **Perplexity:** the model shows a tightly clustered, low perplexity distribution (roughly 2.5 to 4.0), indicating consistent, fluent generation of mathematical syntax.
2. **Complexity Ceiling:** the model reaches about 80% accuracy on short word problems, keeping its chain of thought concise and accurate. Like many 8B-class models, its accuracy drops as prompts grow longer, especially on complex, multi-paragraph logic puzzles.

## 💻 Usage Example (Python)
To run this model locally, you can use `llama-cpp-python`:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3.1-8B.Q4_K_M.gguf",
    n_ctx=2048,
    n_threads=4,
)

prompt = (
    "Below is a math word problem. Solve it step by step and provide the final answer.\n\n"
    "### Problem:\nIf the cost of 18 apples is 90 rupees, what is the cost of 24 apples?\n\n"
    "### Solution:\n"
)

output = llm(
    prompt,
    max_tokens=256,
    temperature=0.2,
    stop=["<|eot_id|>"],
)
print(output["choices"][0]["text"])
```