anthonym21 committed on
Commit f0f5adc · verified · 1 Parent(s): 009a643

Add GGUF quantizations (Q8_0, Q4_K_M)

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Eve-2-MoE-NanoExtract-272M-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Eve-2-MoE-NanoExtract-272M-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
Eve-2-MoE-NanoExtract-272M-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cc0f4cf0a077532f0a7e01a5c3b6cef47e2dcd862a071b66d985678acd12a0b2
+ size 189484672
Eve-2-MoE-NanoExtract-272M-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5441b04a744ed29029fae5a079b5dd3357361a1b2aca59d65d14ab47da246a51
+ size 290929792
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ base_model: anthonym21/Eve-2-MoE-NanoExtract-272M
+ tags:
+ - gguf
+ - quantized
+ - moe
+ - eve-2
+ license: apache-2.0
+ ---
+
+ # Eve-2-MoE-NanoExtract-272M - GGUF
+
+ GGUF quantizations of [anthonym21/Eve-2-MoE-NanoExtract-272M](https://huggingface.co/anthonym21/Eve-2-MoE-NanoExtract-272M).
+
+ ## Quantization Variants
+
+ | Quantization | Filename | Size |
+ |---|---|---|
+ | Q8_0 | Eve-2-MoE-NanoExtract-272M-Q8_0.gguf | 290.9 MB |
+ | Q4_K_M | Eve-2-MoE-NanoExtract-272M-Q4_K_M.gguf | 189.5 MB |
+
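As a rough sanity check on the table above, the file sizes can be converted to effective bits per weight. This sketch assumes all 272M parameters are stored in the quantized format, which ignores GGUF metadata and per-block scale factors, so the numbers slightly overstate the true per-weight cost:

```python
# Rough bytes-per-parameter estimate for each GGUF file.
# Assumption: all 272M parameters are quantized; real files also
# carry metadata and per-block scales, inflating the estimate a bit.
PARAMS = 272_000_000

files = {
    "Q8_0": 290_929_792,    # file size in bytes, from the table above
    "Q4_K_M": 189_484_672,
}

for name, size_bytes in files.items():
    bpp = size_bytes / PARAMS
    print(f"{name}: {bpp:.2f} bytes/param ({bpp * 8:.1f} bits/weight)")
```

This lands near the nominal 8 bits for Q8_0 and a mixed ~5–6 bits for Q4_K_M, which is expected since K-quants keep some tensors at higher precision.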
+ ## Usage with Ollama
+
+ ```bash
+ ollama run anthonym21/eve-2-moe-nanoextract-272m
+ ```
+
+ ## Usage with llama.cpp
+
+ ```bash
+ llama-cli -m Eve-2-MoE-NanoExtract-272M-Q4_K_M.gguf -p "Your prompt here"
+ ```
+
+ ## Architecture
+
+ - **Type**: DeepSeek-style Mixture of Experts (MoE)
+ - **Parameters**: 272M total
+ - **Layers**: 12
+ - **Hidden dim**: 512
+ - **Experts**: 8 routed (top-2) + 1 shared per layer
+ - **Context**: 2048 tokens
+ - **Tokenizer**: GPT-2
+
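The "8 routed (top-2) + 1 shared" layout can be illustrated with a small sketch. This is not the model's actual code: the experts here are hypothetical toy scalar functions standing in for FFN blocks, and the gate logits are hard-coded rather than produced by a gating network. It only shows the generic DeepSeek-style step of softmax gating, top-2 selection with renormalization, and an always-on shared expert:

```python
import math

# Illustrative top-2 gating over 8 routed experts plus one shared expert.
N_EXPERTS, TOP_K = 8, 2

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(x, gate_logits, routed_experts, shared_expert):
    probs = softmax(gate_logits)
    # Select the top-2 experts by gate probability.
    top = sorted(range(N_EXPERTS), key=lambda i: probs[i], reverse=True)[:TOP_K]
    # Renormalize the selected gates so they sum to 1.
    norm = sum(probs[i] for i in top)
    routed = sum(probs[i] / norm * routed_experts[i](x) for i in top)
    # The shared expert always contributes, regardless of routing.
    return routed + shared_expert(x)

# Toy experts: expert k scales its input by (k + 1).
experts = [lambda x, k=k: (k + 1) * x for k in range(N_EXPERTS)]
shared = lambda x: 0.5 * x
logits = [0.1, 2.0, -1.0, 0.3, 1.5, -0.5, 0.0, 0.2]  # would come from a gating network
y = moe_layer(1.0, logits, experts, shared)
print(y)
```

Only the two selected experts are evaluated per token, which is why a 272M-parameter MoE runs with far fewer active parameters than its total count suggests.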
+ ## Parent Model
+
+ This is a quantized version of [anthonym21/Eve-2-MoE-NanoExtract-272M](https://huggingface.co/anthonym21/Eve-2-MoE-NanoExtract-272M).