ISTA-DASLab/Llama-3.1-8B-Instruct-MatGPTQ

8-bit precision


This is the official MatGPTQ checkpoint of meta-llama/Llama-3.1-8B-Instruct, produced as described in the "MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization" paper.
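For readers unfamiliar with the term, "Matryoshka" quantization generally refers to integer codes that nest lower-precision codes in their most significant bits, so that one 8-bit checkpoint can also yield 4-bit or 2-bit variants. The following is a minimal illustrative sketch of that nesting idea in plain Python. It is not the MatGPTQ algorithm and does not reflect this checkpoint's actual storage format; the function names, the plain truncation (real methods may round and clamp instead), and the affine `scale`/`zero_point` dequantization are all assumptions made for illustration.

```python
def slice_msb(codes, src_bits=8, dst_bits=4):
    """Keep only the dst_bits most significant bits of each unsigned code.

    Hypothetical helper: plain truncation via right shift.
    """
    if not 0 < dst_bits <= src_bits:
        raise ValueError("dst_bits must be in 1..src_bits")
    shift = src_bits - dst_bits
    return [c >> shift for c in codes]


def dequantize(codes, scale, zero_point, src_bits=8, kept_bits=8):
    """Map (possibly sliced) codes back to real values.

    Sliced codes are shifted back to the src_bits grid first, so the same
    scale and zero_point (defined for the src_bits codes) can be reused.
    """
    shift = src_bits - kept_bits
    return [scale * ((c << shift) - zero_point) for c in codes]


codes_8bit = [0, 37, 128, 200, 255]          # made-up example codes
codes_4bit = slice_msb(codes_8bit, 8, 4)     # [0, 2, 8, 12, 15]
codes_2bit = slice_msb(codes_8bit, 8, 2)     # [0, 0, 2, 3, 3]

full = dequantize(codes_8bit, scale=0.5, zero_point=128)
coarse = dequantize(codes_4bit, scale=0.5, zero_point=128, kept_bits=4)
```

The point of the sketch is only that the lower-precision codes are derived from the 8-bit ones rather than stored separately; how MatGPTQ chooses the 8-bit codes so that every nested precision stays accurate is described in the paper.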

This model can be run with vLLM. Check out our integration at IST-DASLab/MatGPTQ.


Safetensors

Model size: 8B params

Tensor type: BF16 · F16 · U8


Collection including ISTA-DASLab/Llama-3.1-8B-Instruct-MatGPTQ

MatGPTQ

MatGPTQ quantized models • 7 items • Updated Feb 18

Paper for ISTA-DASLab/Llama-3.1-8B-Instruct-MatGPTQ

MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization

Paper • 2602.03537 • Published Feb 3