boapro committed on
Commit 723816a · verified · 1 Parent(s): 918d285

Upload 3 files

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +59 -3
  3. WRT_Llama-3.1-2-8B-Q4_K_S.gguf +3 -0
  4. gitattributes +60 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ WRT_Llama-3.1-2-8B-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,59 @@
- ---
- license: mit
- ---
+ ---
+ base_model: meta-llama/Llama-3.1-8B
+ license: mit
+ pipeline_tag: text-generation
+ tags:
+ - Llama-3
+ - finetune
+ quantized_by: boapro
+ ---
+
+ ## Llamacpp imatrix Quantizations of meta-llama/Llama-3.1-8B
+ Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3878">b3878</a> for quantization.
+
+ Original model: https://huggingface.co/meta-llama/Llama-3.1-8B
+
+ Run it in [LM Studio](https://lmstudio.ai/)
+
+ ## Prompt format
+
+ ```
+ <|begin_of_text|><|start_header_id|>system<|end_header_id|>
+
+ {system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>
+
+ {prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+ ```
+
+ ## Downloading using huggingface-cli
+
+ First, make sure you have huggingface-cli installed:
+
+ ```
+ pip install -U "huggingface_hub[cli]"
+ ```
+
+ Then, you can target the specific file you want:
+
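As a minimal sketch (the repo id `boapro/WRT_II` is assumed from the local-dir note in this section, and the filename is this repo's uploaded quant; adjust both to taste):

```shell
# Download a single quant file from the repo into the current directory
huggingface-cli download boapro/WRT_II \
  --include "WRT_Llama-3.1-2-8B-Q4_K_S.gguf" \
  --local-dir ./
```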
+ If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
+
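A hedged sketch of the multi-file download (the `*Q8_0*` pattern is illustrative; substitute the quant you actually want, and the repo id is assumed as above):

```shell
# Download every piece of a split quant matching the pattern into one folder
huggingface-cli download boapro/WRT_II \
  --include "*Q8_0*" \
  --local-dir ./
```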
+ You can either specify a new local-dir (boapro/WRT_II) or download them all in place (./)
+
+ ## Q4_0_X_X
+
+ If you're using an ARM chip, the Q4_0_X_X quants will give a substantial speedup. Check out Q4_0_4_4 speed comparisons [on the original pull request](https://github.com/ggerganov/llama.cpp/pull/5780#pullrequestreview-21657544660)
+
+ To check which one would work best for your ARM chip, you can check [AArch64 SoC features](https://gpages.juszkiewicz.com.pl/arm-socs-table/arm-socs.html) (thanks EloyOn!).
+
+ If you want to get more into the weeds, you can check out this extremely useful feature chart:
+
+ [llama.cpp feature matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix)
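As a rough aid on a Linux AArch64 box, you can list the relevant CPU feature flags directly (a sketch; the feature names follow the kernel's `/proc/cpuinfo` output, and the variant mapping is the commonly cited one: dotprod → Q4_0_4_4, i8mm → Q4_0_4_8, sve → Q4_0_8_8):

```shell
# Print the CPU feature flags relevant to choosing a Q4_0_X_X quant
# (asimddp = NEON dot product, i8mm = int8 matrix multiply, sve = scalable vectors)
grep -oE 'asimddp|i8mm|sve' /proc/cpuinfo | sort -u
```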
WRT_Llama-3.1-2-8B-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f782e8204a53d13a8e365deefea10dc2e29a27cd209034a5066d3c0c45ee1f60
+ size 4692669920
gitattributes ADDED
@@ -0,0 +1,60 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q6_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q5_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_0_8_8.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_0_4_8.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_0_4_4.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-IQ2_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B-f16.gguf filter=lfs diff=lfs merge=lfs -text
+ Llama-3.1-WhiteRabbitNeo-2-8B.imatrix filter=lfs diff=lfs merge=lfs -text