Qwen3-Quantization
This is the official collection of quantized Qwen3 models:
- Efficient-ML/Qwen3-0.6B-base-gptq-w4-128 (updated May 5)
- Efficient-ML/Qwen3-0.6B-base-gptq-w8-128 (updated May 5)
- Efficient-ML/Qwen3-0.6B-base-gptq-w8-perchannel (updated May 5)
- Efficient-ML/Qwen3-0.6B-base-gptq-w4-perchannel (updated May 5)
LLaMA3-Quantization
This is the official collection of quantized models from "How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study":
- Efficient-ML/LLaMA-3-8B-GPTQ-4bit-b128 (updated Apr 21, 2024)
- Efficient-ML/LLaMA-3-8B-SmoothQuant-4bit-4bit (updated Apr 22, 2024)
- Efficient-ML/LLaMA-3-8B-AWQ-4bit-b128 (updated Apr 28, 2024)
- Efficient-ML/LLaMA-3-8B-SmoothQuant-8bit-8bit (updated Apr 22, 2024)
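To illustrate what the bit widths in these model names (w4, w8, 4bit, 8bit) mean in practice, here is a minimal sketch of the approximate weight-storage footprint of an 8B-parameter model at different precisions. The figures count raw weight bits only; real GPTQ/AWQ/SmoothQuant checkpoints also store per-group or per-channel scales and zero-points, so actual sizes are somewhat larger.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB, ignoring quantization metadata
    such as scales and zero-points."""
    return num_params * bits_per_weight / 8 / 2**30

params = 8e9  # roughly LLaMA-3-8B / the 8B checkpoints in this collection
for bits in (16, 8, 4):
    print(f"w{bits}: {weight_memory_gib(params, bits):.1f} GiB")
# → w16: 14.9 GiB, w8: 7.5 GiB, w4: 3.7 GiB
```

Going from FP16 to 4-bit weights cuts weight storage by roughly 4x, which is the main motivation behind the w4/4bit variants listed above.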