Disobedience rate: 14% (original model: 71%)
KL divergence: 0.1937
Quants: not available yet (llama.cpp does not support this model at the moment)
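For readers unfamiliar with the KL divergence figure above, here is a minimal sketch of how such a number can be computed between the next-token distributions of an original and a modified model. The logits are made-up illustrative values, not outputs of this model, and the direction of the divergence and the choice of prompts are assumptions rather than a description of how the reported 0.1937 was measured.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token logits over a three-token vocabulary.
original = softmax([2.0, 1.0, 0.1])   # stands in for the base model
modified = softmax([1.8, 1.2, 0.1])   # stands in for the modified model

print(f"KL(original || modified) = {kl_divergence(original, modified):.4f}")
```

A value of 0 means the two distributions are identical; larger values mean the modified model's predictions have drifted further from the original's.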
Parameters:
direction_index = per layer
attn.o_proj.max_weight = 1.18
attn.o_proj.max_weight_position = 26.64
attn.o_proj.min_weight = 0.77
attn.o_proj.min_weight_distance = 9.40
mlp.down_proj.max_weight = 0.87
mlp.down_proj.max_weight_position = 28.92
mlp.down_proj.min_weight = 0.22
mlp.down_proj.min_weight_distance = 18.60
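The parameters above can be read as a per-layer weight kernel for each component. The sketch below assumes a linear kernel: the ablation weight peaks at `max_weight` at `max_weight_position`, falls linearly to `min_weight` at a distance of `min_weight_distance` layers, and layers further away are left untouched. The names `KernelParams`, `ablation_weight`, and `NUM_LAYERS` are hypothetical, and the layer count of 36 is an assumption; take the real value from the base model's config.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KernelParams:
    max_weight: float
    max_weight_position: float
    min_weight: float
    min_weight_distance: float


def ablation_weight(layer_index: float, p: KernelParams) -> Optional[float]:
    """Return the ablation weight for a layer, or None if the layer is skipped."""
    distance = abs(layer_index - p.max_weight_position)
    if distance > p.min_weight_distance:
        return None  # outside the kernel: layer is not modified
    # Linear interpolation from max_weight (distance 0) to min_weight (edge).
    return p.max_weight + (distance / p.min_weight_distance) * (p.min_weight - p.max_weight)


# Values copied from the parameter list above.
ATTN_O_PROJ = KernelParams(1.18, 26.64, 0.77, 9.40)
MLP_DOWN_PROJ = KernelParams(0.87, 28.92, 0.22, 18.60)

NUM_LAYERS = 36  # assumption: check num_hidden_layers in the model config


def fmt(w: Optional[float]) -> str:
    return "skip" if w is None else f"{w:.3f}"


for layer in range(NUM_LAYERS):
    attn = ablation_weight(layer, ATTN_O_PROJ)
    mlp = ablation_weight(layer, MLP_DOWN_PROJ)
    print(f"layer {layer:2d}  attn.o_proj={fmt(attn)}  mlp.down_proj={fmt(mlp)}")
```

Under these assumptions, only the later layers are modified for `attn.o_proj`, while `mlp.down_proj` is affected over a wider range of layers at lower weights.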
Model: hereticness/Heretic-Instella-3B-Long-Instruct
Base model: amd/Instella-3B-Long-Instruct