| * Parameters: | |
| * direction_index = 20.66 | |
| * attn.o_proj.max_weight = 1.47 | |
| * attn.o_proj.max_weight_position = 30.94 | |
| * attn.o_proj.min_weight = 1.20 | |
| * attn.o_proj.min_weight_distance = 27.15 | |
| * mlp.down_proj.max_weight = 0.95 | |
| * mlp.down_proj.max_weight_position = 40.02 | |
| * mlp.down_proj.min_weight = 0.44 | |
| * mlp.down_proj.min_weight_distance = 16.24 | |
| * Resetting model... | |
| * Abliterating... | |
| * Evaluating... | |
| * Obtaining first-token probability distributions... | |
| * KL divergence: 0.0024 | |
| * Counting model refusals... | |
| * Refusals: 27/100 |
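
The two metrics at the end of the log (KL divergence over first-token probability distributions, and a refusal count out of 100 prompts) can be illustrated with a minimal sketch. The log does not show how the tool computes either one, so everything below is an assumption for illustration: the function names, the direction of the divergence (original model versus modified model), the averaging over prompts, and the substring-based refusal markers are all hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same vocabulary.

    eps guards against log(0); terms where p is zero contribute nothing.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def mean_first_token_kl(original, modified):
    """Average KL over prompts (assumed aggregation, not confirmed by the log).

    Each argument is a list with one first-token distribution per prompt.
    """
    pairs = list(zip(original, modified))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)

def count_refusals(responses, markers=("sorry", "i can't", "i cannot")):
    """Count responses containing any marker (hypothetical marker list)."""
    return sum(any(m in r.lower() for m in markers) for r in responses)

# Toy data: two prompts, three-token vocabulary.
original = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
modified = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2]]
print(f"KL divergence: {mean_first_token_kl(original, modified):.4f}")

responses = ["I'm sorry, I can't help with that.", "Sure, here is an overview."]
print(f"Refusals: {count_refusals(responses)}/{len(responses)}")
```

Under these assumptions, a KL divergence near zero means the modified model's first-token behaviour stays close to the original's on the evaluated prompts, while the refusal count tracks how often the modified model still declines.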