Gemma 3 27B Abliterated - W8A8 INT8
Quantized version of mlabonne/gemma-3-27b-it-abliterated using W8A8.
Quantization Config
- Method: SmoothQuant + GPTQ
- Precision: 8-bit weights, 8-bit activations
- SmoothQuant: smoothing_strength=0.5
- GPTQ: scheme=W8A8, block_size=128
- Calibration: 512 samples from wikitext-2-raw-v1, max_seq_length=1024
- Model size: ~27 GB
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
"neonconverse/gemma-3-27b-abliterated-w8a8-8bit",
device_map="auto",
torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("neonconverse/gemma-3-27b-abliterated-w8a8-8bit")
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for neonconverse/gemma-3-27b-abliterated-w8a8-8bit
Base model
google/gemma-3-27b-pt Finetuned
google/gemma-3-27b-it Finetuned
mlabonne/gemma-3-27b-it-abliterated