Tyler Williams committed on
Commit 31e8029 · 1 Parent(s): 40d65c2

Add Q4_K_M GGUF quantization (4.7GB)

Recommended for production use with llama.cpp and Ollama.
Stays close to FP16 quality while being 3.4x smaller than FP16.
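
Before pointing llama.cpp or Ollama at the file, it can help to confirm that the local copy is a real GGUF file and not an unresolved Git LFS pointer (which is what a clone without `git lfs` leaves behind). The following is a minimal sketch, not part of this commit: the function name `inspect_gguf_header` is hypothetical, and the header layout assumes GGUF version 2 or later (4-byte magic, little-endian uint32 version, then two uint64 counts).

```python
import struct

GGUF_MAGIC = b"GGUF"
LFS_POINTER_PREFIX = b"version https://git-lfs"


def inspect_gguf_header(path):
    """Read the fixed-size GGUF header fields (assumes GGUF v2 or later)."""
    with open(path, "rb") as f:
        head = f.read(24)

    # A clone made without git-lfs contains the small text pointer instead.
    if head.startswith(LFS_POINTER_PREFIX):
        raise ValueError("file is a Git LFS pointer, not the model itself")

    if len(head) < 24 or head[:4] != GGUF_MAGIC:
        raise ValueError("file does not start with the GGUF magic bytes")

    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata count.
    version, tensor_count, metadata_kv_count = struct.unpack("<IQQ", head[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }
```

Calling `inspect_gguf_header("gguf/wraith-8b-Q4_K_M.gguf")` on a fully downloaded file should return the three header fields; a `ValueError` mentioning the LFS pointer usually means `git lfs pull` has not been run yet.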

Files changed (2)
  1. .gitattributes +1 -0
  2. gguf/wraith-8b-Q4_K_M.gguf +3 -0
.gitattributes CHANGED
@@ -8,3 +8,4 @@
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+gguf/*.gguf filter=lfs diff=lfs merge=lfs -text
gguf/wraith-8b-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6e77450d8073013067365d97cdbf5d24b036588288381c92caf165e444fb2bb
+size 4920738720
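
The three added lines are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size, while the model itself lives in LFS storage. A downloaded copy can be checked against those two values. The sketch below is illustrative rather than part of the commit; the helper names `parse_lfs_pointer` and `verify_download` are hypothetical, and the pointer text is copied from the diff above.

```python
import hashlib
from pathlib import Path

# Pointer contents as added in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a6e77450d8073013067365d97cdbf5d24b036588288381c92caf165e444fb2bb
size 4920738720
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into typed fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch means hashing is unnecessary.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Hash in chunks so a multi-gigabyte model never sits fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

With the model downloaded, `verify_download("gguf/wraith-8b-Q4_K_M.gguf", pointer)` returns `True` only when both the byte count and the SHA-256 digest match the pointer.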