ehartford committed
Commit ddb92be · verified
1 Parent(s): 8118f7b

Create README.md

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ base_model: meta-llama/Llama-3.2-1B-Instruct
+ language:
+ - en
+ library_name: transformers
+ license: llama3.2
+ tags:
+ - llama-3
+ - llama
+ - meta
+ - facebook
+ - transformers
+ ---
+
+ Quantizing Llama-3.2-1B
+ Eric Hartford
+
+ I am creating several quants of Llama-3.2-1B to test vLLM's Marlin kernels.
+
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B-FP8-Dynamic
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B-MXFP4
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B-NVFP4A16
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-AWQ
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-GPTQ
+ - https://huggingface.co/QuixiAI/Llama-3.2-1B-W8A16-GPTQ
+
+ The script I used to produce these quants:
+ [quant.py](quant.py)
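
For readers who want to try one of the quants listed in the README, the following is a minimal sketch of a smoke test using vLLM's offline `LLM` API. It is not taken from `quant.py`, whose contents are not shown in this commit. The helper names `repo_ids` and `smoke_test` and the prompt are hypothetical. Whether vLLM actually dispatches to a Marlin kernel for a given checkpoint depends on the vLLM version and the GPU, so treat this as a starting point rather than a verified test procedure.

```python
# Repository names copied from the list in the README.
BASE_REPO = "QuixiAI/Llama-3.2-1B"
QUANT_SUFFIXES = [
    "",  # the base repository, with no quantization suffix
    "-FP8-Dynamic",
    "-MXFP4",
    "-NVFP4A16",
    "-W4A16-AWQ",
    "-W4A16-GPTQ",
    "-W8A16-GPTQ",
]


def repo_ids() -> list[str]:
    """Return the Hugging Face repo ids for the base model and each quant."""
    return [BASE_REPO + suffix for suffix in QUANT_SUFFIXES]


def smoke_test(repo_id: str, prompt: str = "The capital of France is") -> str:
    """Load one checkpoint with vLLM and greedily generate a short completion.

    Hypothetical helper: needs vLLM installed and a GPU it supports.
    """
    # Imported lazily so repo_ids() works on machines without vLLM.
    from vllm import LLM, SamplingParams

    llm = LLM(model=repo_id)
    params = SamplingParams(temperature=0.0, max_tokens=16)
    outputs = llm.generate([prompt], params)
    # One prompt in, one RequestOutput out; take its first completion.
    return outputs[0].outputs[0].text
```

Calling `smoke_test(repo_ids()[1])` would load the FP8-Dynamic checkpoint. Running one checkpoint per process is the safer choice, since each `LLM` instance reserves GPU memory for its lifetime.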