CannaeAI committed on
Commit c73ec2c · verified · 1 Parent(s): 1652ce6

Update README.md

Files changed (1)
  1. README.md +31 -0
README.md CHANGED
@@ -1 +1,32 @@
+ ---
+ base_model:
+ - unsloth/Llama-3.2-1B
+ tags:
+ - text-generation-inference
+ - transformers
+ - math
+ - conversational
+ - llama
+ - meta
+ license: apache-2.0
+ language:
+ - en
+ library_name: transformers
+ ---
+ # GsMath-Llama-1B
+ ## Model Description:
+ This is a fine-tuned version of [unsloth/Llama-3.2-1B](https://huggingface.co/unsloth/Llama-3.2-1B).
+ - **Recommended inference settings:** `min_p = 0.1` and `temperature = 1.5`. See this [tweet](https://x.com/menhguin/status/1826132708508213629) for the rationale.
+ - **License:** apache-2.0
+ - **Fine-tuned from model:** unsloth/Llama-3.2-1B
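
The recommended settings pair a high temperature with min-p filtering. A minimal sketch of how the two interact is shown below, assuming temperature is applied before the min-p cutoff (implementations differ on this ordering). The function names `min_p_filter` and `sample_token` are hypothetical and are not part of this model or of any library.

```python
import math
import random


def min_p_filter(logits, temperature=1.5, min_p=0.1):
    """Return renormalized probabilities after temperature scaling and min-p filtering."""
    # Temperature scaling: values above 1 flatten the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # min-p: drop tokens whose probability is below min_p times the top probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_token(logits, temperature=1.5, min_p=0.1, rng=random):
    """Sample one token index from the filtered distribution."""
    probs = min_p_filter(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy logits for a four-token vocabulary (illustrative values only).
example_logits = [5.0, 4.0, 1.0, -2.0]
filtered = min_p_filter(example_logits)
token = sample_token(example_logits, rng=random.Random(0))
```

With these toy logits, the two low-probability tokens fall below the cutoff and are never sampled, even though the high temperature flattens the distribution. In Hugging Face `transformers`, recent versions accept `do_sample=True`, `temperature`, and `min_p` as generation arguments, which is the likely way to apply these settings in practice.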
+ ## Benchmarks:
+ We evaluate both models on GSM8K using the standard lm-eval 5-shot exact-match protocol. Under identical decoding and extraction settings, GsMath-Llama-1B scores 13.7% against 6.8% for Meta’s Llama-3.2-1B, roughly a 2x improvement in exact-match accuracy at the same parameter count.
+ | Model               | Params | GSM8K (5-shot, EM) |
+ | ------------------- | ------ | ------------------ |
+ | **GsMath-Llama-1B** | 1B     | **13.7%**          |
+ | Llama-3.2-1B        | 1B     | 6.8%               |
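
To make the exact-match metric concrete, here is a rough sketch of how such a score can be computed. GSM8K reference solutions end with a line of the form `#### <number>`; the regular expression, the normalization, and the helper names `extract_answer` and `exact_match` are assumptions for illustration and are not the lm-eval implementation.

```python
import re

# GSM8K reference solutions end with "#### <number>".
# This pattern is an illustrative assumption, not the harness's own filter.
ANSWER_RE = re.compile(r"####\s*(-?[0-9][0-9.,]*)")


def extract_answer(text):
    """Return the last '#### <number>' answer as a normalized string, or None."""
    matches = ANSWER_RE.findall(text)
    if not matches:
        return None
    # Drop thousands separators and a trailing period.
    return matches[-1].replace(",", "").rstrip(".")


def exact_match(predictions, references):
    """Fraction of predictions whose extracted answer equals the reference answer."""
    hits = 0
    for pred, ref in zip(predictions, references):
        gold = extract_answer(ref)
        if gold is not None and extract_answer(pred) == gold:
            hits += 1
    return hits / len(references)


# Toy data (not taken from GSM8K): one correct and one incorrect prediction.
preds = ["3 * 6 = 18 #### 18", "2 + 3 = 5 #### 5"]
refs = ["She sells 18 eggs. #### 18", "He has 7 apples. #### 7"]
score = exact_match(preds, refs)
```

Because the comparison is on strings, a prediction of `18.0` would not match a reference of `18` here; a real harness may normalize answers differently, which is why identical extraction settings matter when comparing models.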
+
+ <p align="center">
+ <img alt="GsMath-Llama-1B" src="https://huggingface.co/Cannae-AI/ReasoningLlama-Math-1B-IT/resolve/main/ChatGPT%20Image%2018%20nov.%202025%2C%2020_55_23.png">
+ </p>