QuixiAI/Llama-3.2-1B-FP8-Dynamic

Text Generation

text-generation-inference

compressed-tensors

base_model: meta-llama/Llama-3.2-1B-Instruct
language:
  - en
library_name: transformers
license: llama3.2
tags:
  - llama-3
  - llama
  - meta
  - facebook
  - transformers

Quantizing Llama-3.2-1B

Eric Hartford

I am creating several quants of Llama-3.2-1B to test vLLM's Marlin kernels.

The script I used to quantize this model: quant.py
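
For readers unfamiliar with the "FP8-Dynamic" part of the name, here is a minimal pure-Python sketch of the idea. It is not quant.py and makes no claim about what that script does. It assumes "Dynamic" means that each row (token) gets its own scale computed at runtime from its largest absolute value, and that the FP8 format is E4M3 with a maximum finite value of 448. The helper names `round_to_e4m3`, `quantize_row`, and `dequantize_row` are hypothetical and exist only for illustration.

```python
import math

# Assumption: FP8 E4M3 ("fn" variant): 3 mantissa bits, max finite value 448,
# smallest normal exponent -6. These constants describe the number format only.
FP8_E4M3_MAX = 448.0
MIN_NORMAL_EXP = -6
MANTISSA_BITS = 3


def round_to_e4m3(value):
    """Round a float to the nearest value on the E4M3 grid (hypothetical helper)."""
    a = min(abs(value), FP8_E4M3_MAX)        # saturate instead of overflowing
    _, e = math.frexp(a)                     # a = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, MIN_NORMAL_EXP)         # below 2**-6 the grid is subnormal
    step = 2.0 ** (exp - MANTISSA_BITS)      # spacing between neighbours
    return math.copysign(round(a / step) * step, value)


def quantize_row(row):
    """Compute one scale for the row at runtime, then round each element."""
    amax = max(abs(x) for x in row)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(x / scale) for x in row], scale


def dequantize_row(q, scale):
    """Map the FP8 grid values back to the original range."""
    return [v * scale for v in q]


if __name__ == "__main__":
    activations = [0.12, -1.7, 3.3, 0.004]   # made-up example values
    q, scale = quantize_row(activations)
    print("scale:", scale)
    print("quantized:", q)
    print("restored:", dequantize_row(q, scale))
```

Because the scale is derived from each row as it arrives, no calibration data is needed for activations under this assumption; the cost is one max-reduction per row at inference time.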