reaperdoesntknow committed on
Commit d092982 · verified · 1 Parent(s): b563b64

Upload folder using huggingface_hub

Files changed (4)
  1. .gitattributes +2 -0
  2. README.md +61 -0
  3. SMOLM2Prover-Q4_K_M.gguf +3 -0
  4. SMOLM2Prover.gguf +3 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ SMOLM2Prover-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ SMOLM2Prover.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,61 @@
+ # SMOLM2Prover - GGUF Format
+
+ A GGUF-quantized version of the SMOLM2Prover model for use with llama.cpp and compatible runtimes.
+
+ ## Model Details
+
+ - **Original Model**: reaperdoesntknow/SMOLM2Prover
+ - **Architecture**: LlamaForCausalLM
+ - **Context Length**: 8192 tokens
+ - **Embedding Dimension**: 960
+ - **Layers**: 32
+ - **Attention Heads**: 15 query heads, 5 key/value heads (grouped-query attention)
+
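+ To double-check these values against what is actually embedded in the file, you can dump the GGUF metadata. A minimal sketch, assuming the `gguf` Python package is installed (`pip install gguf`), which provides a `gguf-dump` command:
+
+ ```bash
+ # Dump the GGUF metadata; keys such as llama.context_length,
+ # llama.embedding_length and llama.block_count should match the values above.
+ gguf-dump SMOLM2Prover-Q4_K_M.gguf
+ ```
+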
+ ## Available Files
+
+ | File | Size | Quantization | Quality |
+ |------|------|--------------|---------|
+ | `SMOLM2Prover.gguf` | 692M | F16 | Original (no quantization) |
+ | `SMOLM2Prover-Q4_K_M.gguf` | 258M | Q4_K_M | Recommended (good quality/size balance) |
+
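+ To fetch just the quantized file from the Hub, the `huggingface-cli` tool from the `huggingface_hub` package can be used. A minimal sketch, where `<this-repo-id>` is a placeholder for this repository's id:
+
+ ```bash
+ # Download only the Q4_K_M file into the current directory
+ huggingface-cli download <this-repo-id> SMOLM2Prover-Q4_K_M.gguf --local-dir .
+ ```
+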
+ ## Usage
+
+ ### With llama.cpp
+
+ ```bash
+ # Run with the quantized model
+ ./llama-cli -m SMOLM2Prover-Q4_K_M.gguf -p "Your prompt here" -n 256
+ ```
+
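+ llama.cpp also ships an HTTP server with an OpenAI-compatible API. A minimal sketch, where the port is an arbitrary choice and `-c 8192` simply requests the model's full context length:
+
+ ```bash
+ # Serve the quantized model over HTTP on port 8080
+ ./llama-server -m SMOLM2Prover-Q4_K_M.gguf -c 8192 --port 8080
+ ```
+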
+ ### With Ollama
+
+ Create a `Modelfile`:
+ ```
+ FROM ./SMOLM2Prover-Q4_K_M.gguf
+ ```
+
+ Then:
+ ```bash
+ ollama create smolm2prover -f Modelfile
+ ollama run smolm2prover
+ ```
+
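+ The `Modelfile` above is the bare minimum; Ollama also accepts optional directives. A sketch with an explicit context window (the `num_ctx` value simply mirrors the model's 8192-token limit and is an assumption, not a file shipped with this repository):
+
+ ```
+ FROM ./SMOLM2Prover-Q4_K_M.gguf
+ PARAMETER num_ctx 8192
+ ```
+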
+ ### With LM Studio
+
+ 1. Download `SMOLM2Prover-Q4_K_M.gguf`
+ 2. Place it in your LM Studio models folder
+ 3. Load the model and chat
+
+ ## Quantization Details
+
+ The Q4_K_M quantization uses:
+ - Q4_K for most weight tensors
+ - Q5_0 fallback for tensors whose row size is not divisible by the 256-element block size
+ - Q6_K/Q8_0 for some critical layers
+
+ - **Size reduction**: 692M → 258M (63% smaller)
+ - **BPW**: 5.94 bits per weight
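+
+ A file like this is typically produced from the F16 GGUF with llama.cpp's quantization tool. A minimal sketch, assuming a llama.cpp build that provides the `llama-quantize` binary; it is illustrative rather than the exact command used for this repository:
+
+ ```bash
+ # Quantize the F16 GGUF down to Q4_K_M
+ ./llama-quantize SMOLM2Prover.gguf SMOLM2Prover-Q4_K_M.gguf Q4_K_M
+ ```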
+
+ ## License
+
+ Same as the original model.
SMOLM2Prover-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:55f1d76afef32d2a1c3e1d67cd8f7f464286f60ff146270c9eeb875f95f96bbc
+ size 270591136
SMOLM2Prover.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d350a1f1c1ce510879cb06d28850d43afe723dfcd5a3c8113a9de699c4f98ae1
+ size 725554336