Heretic? Heretic!
Disobedience rate: 14% (original model: 71%)
KL divergence: 0.1937
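
For readers unfamiliar with the metric, here is a minimal sketch of a KL divergence between two next-token probability distributions. It assumes the reported value compares the original model's distribution `p` against the modified model's distribution `q`; the direction, the prompts used, and the function name `kl_divergence` are assumptions for illustration, not details confirmed by this card.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy three-token vocabulary (made-up numbers, not measurements from this model).
original = [0.70, 0.20, 0.10]
modified = [0.60, 0.25, 0.15]
print(f"KL(original || modified) = {kl_divergence(original, modified):.4f} nats")
```

A value of 0 means the two distributions are identical; larger values mean the modified model's predictions drift further from the original's.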

Quants? Not yet (still not supported by llama.cpp at the moment)

Parameters:
direction_index = per layer
attn.o_proj.max_weight = 1.18
attn.o_proj.max_weight_position = 26.64
attn.o_proj.min_weight = 0.77
attn.o_proj.min_weight_distance = 9.40
mlp.down_proj.max_weight = 0.87
mlp.down_proj.max_weight_position = 28.92
mlp.down_proj.min_weight = 0.22
mlp.down_proj.min_weight_distance = 18.60
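
The four values per component can be read as a per-layer ablation weight curve. The sketch below assumes a linear kernel: the weight equals `max_weight` at `max_weight_position`, falls off linearly to `min_weight` at a distance of `min_weight_distance` layers, and is zero (layer left untouched) beyond that. This is an interpretation of the parameter names, not something stated in this card; `KernelParams`, `layer_weight`, and `NUM_LAYERS = 36` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class KernelParams:
    max_weight: float
    max_weight_position: float
    min_weight: float
    min_weight_distance: float

def layer_weight(layer_index: int, p: KernelParams) -> float:
    """Ablation weight for one layer under the assumed linear kernel."""
    distance = abs(layer_index - p.max_weight_position)
    if distance > p.min_weight_distance:
        return 0.0  # assumed: layers outside the kernel are not modified
    # Linear interpolation from max_weight (distance 0) to min_weight (edge).
    return p.max_weight + (distance / p.min_weight_distance) * (p.min_weight - p.max_weight)

# Values copied from the parameter list above.
ATTN_O_PROJ = KernelParams(1.18, 26.64, 0.77, 9.40)
MLP_DOWN_PROJ = KernelParams(0.87, 28.92, 0.22, 18.60)

NUM_LAYERS = 36  # assumption: check the model's config for the real layer count

for i in range(NUM_LAYERS):
    print(f"layer {i:2d}  attn.o_proj={layer_weight(i, ATTN_O_PROJ):.3f}  "
          f"mlp.down_proj={layer_weight(i, MLP_DOWN_PROJ):.3f}")
```

Under this reading, `attn.o_proj` is only modified in layers roughly 18 through 36, while `mlp.down_proj` is affected over a wider band starting around layer 11.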

Format: Safetensors
Model size: 3B params
Tensor type: BF16

Model: hereticness/Heretic-Instella-3B-Long-Instruct (finetune)