GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Paper: arXiv:2210.17323
The model published in this repository was quantized to 4-bit precision using GPTQModel.
Quantization details
All quantization parameters were taken from the GPTQ paper.
The GPTQ calibration data consisted of 128 random 2048-token segments from the C4 dataset.
The group size used for quantization is 128.
Other parameters can be found in the quantize_config.json file: https://huggingface.co/iproskurina/Ministral-8B-Instruct-2410-gptqmodel-4bit/blob/main/quantize_config.json
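As a rough illustration of the calibration setup described above, the sketch below samples fixed-length token windows at random positions from a long token stream. The function name and the use of Python's `random` module are illustrative assumptions, not the exact preprocessing code used for this checkpoint.

```python
import random

def sample_calibration_segments(tokens, n_segments=128, seg_len=2048, seed=0):
    """Draw n_segments random windows of seg_len tokens each,
    mirroring the '128 random 2048-token segments' setup above."""
    rng = random.Random(seed)
    max_start = len(tokens) - seg_len
    starts = [rng.randrange(max_start + 1) for _ in range(n_segments)]
    return [tokens[s:s + seg_len] for s in starts]

# Toy usage with a synthetic token stream (the real calibration uses tokenized C4 text):
toy_tokens = list(range(100_000))
segments = sample_calibration_segments(toy_tokens, n_segments=4, seg_len=16)
```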
Requires GPTQModel v4 or later. Installation details: https://github.com/ModelCloud/GPTQModel?tab=readme-ov-file#install.

```shell
pip install -v gptqmodel --no-build-isolation
```
GPTQModel package: https://github.com/ModelCloud/GPTQModel
```python
from gptqmodel import GPTQModel

# Load the quantized checkpoint from this repository.
model_id = "iproskurina/Ministral-8B-Instruct-2410-gptqmodel-4bit"
model = GPTQModel.load(model_id)

result = model.generate("Uncovering deep insights")[0]  # token ids
print(model.tokenizer.decode(result))  # decoded string
```
Base model
mistralai/Ministral-8B-Instruct-2410