GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Paper: arXiv:2210.17323
4-bit GPTQ quantized version of utter-project/EuroLLM-9B-Instruct for inference with the Private LLM app.
Base model: utter-project/EuroLLM-9B-Instruct
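As background on what "4-bit GPTQ quantized" means in practice, here is a minimal NumPy sketch of 4-bit grouped weight quantization using plain round-to-nearest. This is an illustration only: real GPTQ additionally uses second-order (Hessian-based) information to choose roundings that minimize layer output error, which this sketch omits, and the group size of 128 is an assumed, commonly used default.

```python
import numpy as np

def quantize_4bit(weights, group_size=128):
    # Per-group symmetric quantization: each group of `group_size`
    # weights shares one float scale; values are stored as signed
    # 4-bit integers in [-8, 7].
    w = weights.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q.reshape(weights.shape), scale

def dequantize_4bit(q, scale, group_size=128):
    # Reconstruct approximate float weights from ints and scales.
    shape = q.shape
    return (q.reshape(-1, group_size) * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
err = np.abs(w - w_hat).max()
```

Storing one scale per 128 weights plus 4-bit integers is what brings a 9B-parameter model down to roughly a quarter of its fp16 size, at the cost of a bounded per-weight rounding error (at most half a quantization step per group).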