NeuralDaredevil-8B-abliterated-NF4

Permanently quantized to NF4

This is a 4-bit NF4 quantized version of
mlabonne/NeuralDaredevil-8B-abliterated
(fine-tuned and abliterated by mlabonne).

License: Meta Llama 3 Community License
(same as the base model; commercial use is allowed).


Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ikarius/NeuralDaredevil-8B-abliterated-NF4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,   # Optimal for Llama 3
    trust_remote_code=True
)

input_text = "What is abliteration?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
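The base model is a Llama 3 instruct fine-tune, so chat-style prompts should be wrapped in the Llama 3 chat format. In practice `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` does this for you; the sketch below builds the prompt by hand only to show what that format looks like, and assumes this checkpoint uses the standard Llama 3 special tokens.

```python
# Minimal sketch of the Llama 3 chat prompt format (an assumption about this
# checkpoint; prefer tokenizer.apply_chat_template in real code).
def build_llama3_prompt(user_message: str,
                        system: str = "You are a helpful assistant.") -> str:
    # Each turn is a header block ending with <|eot_id|>; the trailing
    # assistant header tells the model it is its turn to generate.
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user_message + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("What is abliteration?")
```

The resulting string can be tokenized directly with `tokenizer(prompt, return_tensors="pt")` in place of the plain `input_text` above.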

Quantization Details

  • Original model: mlabonne/NeuralDaredevil-8B-abliterated
  • Architecture: Llama 3 8B
  • Quantization: bitsandbytes 4-bit NF4 (permanent)
  • Double quantization: Enabled
  • Compute dtype: bfloat16 (recommended for Llama 3)
  • Estimated VRAM: ~6.3–6.7 GB (CUDA)
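For intuition about what NF4 quantization does to the weights, here is a small pure-Python sketch of blockwise 4-bit quantization: each block of weights is scaled by its absolute maximum, and each value is snapped to the nearest of 16 fixed levels. The level values are the (rounded) NF4 quantiles from the QLoRA paper and should be treated as an illustration, not the exact bitsandbytes kernels.

```python
# Approximate NF4 levels (quantiles of a normal distribution, normalized to
# [-1, 1]); values rounded to 4 decimals for this sketch.
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]

def nf4_quantize_block(block):
    """Quantize one block of floats to 4-bit codes plus one absmax scale."""
    absmax = max(abs(x) for x in block) or 1.0
    codes = [min(range(16), key=lambda i: abs(x / absmax - NF4_LEVELS[i]))
             for x in block]
    return codes, absmax

def nf4_dequantize_block(codes, absmax):
    """Reconstruct approximate weights from 4-bit codes and the block scale."""
    return [NF4_LEVELS[c] * absmax for c in codes]
```

"Double quantization" (enabled here) additionally quantizes the per-block `absmax` scales themselves, saving roughly 0.4 bits per parameter on top of the 4-bit codes, which is why the whole 8B model fits in well under 7 GB of VRAM.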

Credits

Base model & abliteration: @mlabonne
NF4 quantization: @ikarius
