QuixiAI/Llama-3.2-1B-W4A16-AWQ

Tags: Text Generation, text-generation-inference, compressed-tensors

Quantizing Llama-3.2-1B

Eric Hartford

I am creating several quantized versions of Llama-3.2-1B to test vLLM's Marlin kernels.

The script I used to quantize this model: quant.py
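For readers unfamiliar with the "W4A16" in the model name: it denotes 4-bit weights with 16-bit activations. The sketch below is not the contents of quant.py and is not AWQ itself; it is plain round-to-nearest, symmetric, group-wise 4-bit quantization in pure Python, followed by packing eight 4-bit codes into one 32-bit word. The group size, the function names, and the nibble layout are all assumptions chosen for illustration, and the real checkpoint format may differ.

```python
# Illustrative only: plain round-to-nearest 4-bit weight quantization.
# Not AWQ, not quant.py; names, group size, and bit layout are hypothetical.

GROUP_SIZE = 8  # assumption for the demo; real configs often use larger groups


def quantize_group(weights):
    """Symmetric 4-bit quantization of one group.

    Returns (codes, scale) where each code is an int in [-8, 7].
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def pack_int4(codes):
    """Pack eight signed 4-bit codes into one 32-bit word, lowest nibble first."""
    if len(codes) != 8:
        raise ValueError("expected exactly eight 4-bit codes")
    word = 0
    for i, code in enumerate(codes):
        word |= (code & 0xF) << (4 * i)  # two's-complement nibble
    return word


def unpack_int4(word):
    """Inverse of pack_int4: recover eight signed 4-bit codes."""
    codes = []
    for i in range(8):
        nibble = (word >> (4 * i)) & 0xF
        codes.append(nibble - 16 if nibble >= 8 else nibble)
    return codes


def dequantize_group(word, scale):
    """Reconstruct approximate weights from a packed word and its scale."""
    return [code * scale for code in unpack_int4(word)]


if __name__ == "__main__":
    weights = [0.62, -0.31, 0.08, 0.0, -0.7, 0.19, 0.45, -0.12]
    codes, scale = quantize_group(weights)
    packed = pack_int4(codes)
    print(codes, hex(packed), dequantize_group(packed, scale))
```

Storing eight 4-bit codes per 32-bit word, with one 16-bit scale per group, is one plausible reason a 4-bit checkpoint would contain integer tensors alongside BF16 ones and report fewer parameters than the base model, though that is an inference rather than something stated on this page.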

Safetensors

Model size: 0.7B params

Tensor types: I64, I32, BF16

Model tree for QuixiAI/Llama-3.2-1B-W4A16-AWQ

Base model: meta-llama/Llama-3.2-1B-Instruct

Quantized (367): this model