laurent-maille/EuroLLM-22B-Instruct-2512-INT8

Quantized INT8 (bitsandbytes / LLM.int8) version of utter-project/EuroLLM-22B-Instruct-2512.
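For intuition, INT8 storage of this kind rests on absmax scaling: each weight row is scaled so that its largest absolute value maps to 127, then rounded to int8. Below is a minimal NumPy sketch of that idea; it is illustrative only, not the actual bitsandbytes implementation (LLM.int8 additionally keeps outlier feature dimensions in higher precision).

```python
import numpy as np

def quantize_int8(w):
    # Per-row absmax scale: the largest |value| in each row maps to 127.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Recover approximate float weights from int8 values and row scales.
    return q.astype(np.float32) * scale

w = np.array([[0.5, -1.0, 0.25],
              [2.0, 0.1, -2.0]], dtype=np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize_int8(q, s)).max()
print(err)  # reconstruction error is bounded by half a quantization step
```

The rounding error per weight is at most half a step (scale / 2), which is why per-row scaling keeps 8-bit weights close to the original FP16 values.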

Notes

  • This repository contains a Transformers-compatible INT8 export.
  • Load with BitsAndBytesConfig(load_in_8bit=True) and device_map="auto"; 8-bit loading requires the bitsandbytes and accelerate packages.

Example

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

repo = "laurent-maille/EuroLLM-22B-Instruct-2512-INT8"
q = BitsAndBytesConfig(load_in_8bit=True)  # LLM.int8 loading via bitsandbytes
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
mdl = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", quantization_config=q, trust_remote_code=True)

# Quick smoke test
inputs = tok("Hello", return_tensors="pt").to(mdl.device)
print(tok.decode(mdl.generate(**inputs, max_new_tokens=32)[0], skip_special_tokens=True))
Model details

  • Format: safetensors
  • Model size: 23B params
  • Tensor types: F16, F32, I8