Moonlight-16B-A3B-Instruct-abliterated

This is an abliterated version of moonshotai/Moonlight-16B-A3B-Instruct with reduced refusals.

Model Details

  • Base Model: moonshotai/Moonlight-16B-A3B-Instruct
  • Architecture: Mixture-of-Experts (MoE) - 16B total, 3B active
  • Modification: Abliteration (refusal direction removal; sketched below)
  • Context Length: 8,192 tokens
  • Abliteration Tool: Bruno
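
A rough sketch of the general idea (not necessarily the exact procedure used by Bruno): abliteration estimates a "refusal direction" in the residual stream, typically as the difference of mean hidden states over refused versus answered prompts, then orthogonalizes the weight matrices that write into the residual stream against that direction. The snippet below is a minimal, hypothetical illustration that assumes the direction has already been estimated; names like refusal_dir are placeholders.

import torch

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    # weight:    (d_model, d_in) matrix that writes into the residual stream
    # direction: (d_model,) estimated refusal direction
    direction = direction / direction.norm()
    # Project out the refusal component: W <- W - d (d^T W)
    return weight - torch.outer(direction, direction @ weight)

# Hypothetical usage once a refusal direction has been estimated:
# refusal_dir = refused_hidden_states.mean(0) - answered_hidden_states.mean(0)
# layer.mlp.down_proj.weight.data = orthogonalize(layer.mlp.down_proj.weight.data, refusal_dir)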

Abliteration Results

| Metric        | Baseline | Post-Abliteration | Change |
|---------------|----------|-------------------|--------|
| Refusal Rate  | 100%     | 41%               | -59%   |
| MMLU Average  | 7.5%     | 7.9%              | +0.4%  |
| KL Divergence | N/A      | 8.94              | -      |

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "quanticsoul4772/Moonlight-16B-A3B-Instruct-abliterated"

# Loading the custom MoE architecture requires trust_remote_code=True
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Build the prompt with the model's chat template
messages = [{"role": "user", "content": "Hello!"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize and generate a sampled response (the decoded output includes the prompt)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Requirements

  • Python 3.10+
  • transformers >= 4.51.0
  • torch >= 2.1.0
  • trust_remote_code=True (required)

Hardware Requirements

| Precision  | VRAM Needed |
|------------|-------------|
| BF16/FP16  | ~32GB       |
| 8-bit      | ~16GB       |
| 4-bit      | ~8GB        |
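
For the 8-bit and 4-bit rows, quantized loading via bitsandbytes is one option. The snippet below is a sketch assuming bitsandbytes is installed; the actual memory footprint will vary with context length and batch size.

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "quanticsoul4772/Moonlight-16B-A3B-Instruct-abliterated"

# 4-bit NF4 quantization; use load_in_8bit=True instead for the 8-bit row
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)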

Disclaimer

This model has been modified to reduce refusals. Use responsibly and in accordance with applicable laws and regulations.
