	---
	base_model: meta-llama/Llama-3.2-1B-Instruct
	language:
	- en
	library_name: transformers
	license: llama3.2
	tags:
	- llama-3
	- llama
	- meta
	- facebook
	- transformers
	---

	Quantizing Llama-3.2-1B
	Eric Hartford

	I am creating several quants of Llama-3.2-1B to test vLLM Marlin.

	- https://huggingface.co/QuixiAI/Llama-3.2-1B
	- https://huggingface.co/QuixiAI/Llama-3.2-1B-FP8-Dynamic
	- https://huggingface.co/QuixiAI/Llama-3.2-1B-MXFP4
	- https://huggingface.co/QuixiAI/Llama-3.2-1B-NVFP4A16
	- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-AWQ
	- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-GPTQ
	- https://huggingface.co/QuixiAI/Llama-3.2-1B-W8A16-GPTQ
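
For anyone who wants to try these checkpoints, here is a minimal sketch of how the list above could be walked in a vLLM smoke test. The repo IDs are taken from the links above; the helper names (`repo_ids`, `smoke_test`), the prompt, and the sampling settings are hypothetical and are not part of `quant.py`. Which kernel vLLM actually selects for a given checkpoint depends on the GPU and the vLLM version, so treat this as a starting point rather than a Marlin test harness.

```python
# Repo names copied from the list above; everything else is illustrative.
ORG = "QuixiAI"
VARIANTS = [
    "Llama-3.2-1B",
    "Llama-3.2-1B-FP8-Dynamic",
    "Llama-3.2-1B-MXFP4",
    "Llama-3.2-1B-NVFP4A16",
    "Llama-3.2-1B-W4A16-AWQ",
    "Llama-3.2-1B-W4A16-GPTQ",
    "Llama-3.2-1B-W8A16-GPTQ",
]


def repo_ids(org=ORG, variants=VARIANTS):
    """Return the full Hugging Face repo ID for each variant."""
    return [f"{org}/{name}" for name in variants]


def smoke_test(repo_id, prompt="The capital of France is", max_tokens=16):
    """Load one checkpoint with vLLM and return a short greedy completion.

    Assumes vLLM is installed and a supported GPU is available.
    The import is done lazily so the rest of this file runs without vLLM.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(model=repo_id)
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


if __name__ == "__main__":
    # Only list the repos here; call smoke_test(...) on one ID at a time,
    # ideally in a fresh process so GPU memory is released between runs.
    for rid in repo_ids():
        print(rid)
```

Running one checkpoint per process (for example `smoke_test("QuixiAI/Llama-3.2-1B-FP8-Dynamic")`) keeps each run isolated, which makes it easier to compare outputs against the unquantized `QuixiAI/Llama-3.2-1B` baseline.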

	The script I used to quantize this model:
	[quant.py](quant.py)