---
base_model:
- meta-llama/Llama-3.3-70B-Instruct
---

All credit to ```https://github.com/jim-plus/llm-abliteration```

I have only slightly modified this implementation to process multiple refusal vectors, which increases the signal strength of the refusal metric calculation by approximately 10%.

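
As a rough illustration of what handling several refusal vectors at once can look like, here is a minimal NumPy sketch. It is not the code from the repository credited above, and this card does not spell out how the vectors are combined. The sketch assumes the common abliteration step of projecting a direction out of a weight matrix, generalized to a subspace spanned by several directions. The function names `orthonormal_basis` and `ablate_directions`, the shapes, and the random data are all hypothetical.

```python
import numpy as np


def orthonormal_basis(directions: np.ndarray, tol: float = 1e-8) -> np.ndarray:
    """Return an orthonormal basis (as rows) for the span of the given row vectors."""
    # SVD copes with duplicate or nearly parallel directions, which are
    # likely when several refusal vectors are measured on similar prompts.
    _, s, vt = np.linalg.svd(directions, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))
    return vt[:rank]


def ablate_directions(weight: np.ndarray, directions: np.ndarray) -> np.ndarray:
    """Remove the span of `directions` from the output space of `weight`.

    weight:     (d_model, d_in) matrix that writes into the residual stream
                (assumed layout, chosen for this example only).
    directions: (k, d_model) candidate refusal vectors.
    """
    basis = orthonormal_basis(directions)       # (r, d_model)
    # Subtract the component of every output column that lies in the subspace.
    return weight - basis.T @ (basis @ weight)


# Toy data standing in for a real weight matrix and measured refusal vectors.
rng = np.random.default_rng(0)
d_model, d_in = 16, 8
W = rng.normal(size=(d_model, d_in))
refusal_dirs = rng.normal(size=(3, d_model))

W_ablated = ablate_directions(W, refusal_dirs)

# After ablation the weight can no longer write along any of the directions,
# so this value should be numerically close to zero.
print(np.abs(refusal_dirs @ W_ablated).max())
```
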
The less text there is in the preceding context, the more permissive the responses. Polite, sanitized behavior is so ingrained in the data that the model still knows how to say no to things it doesn't like, without relying on the policy "refusal vector".

```
Hello assistant, I have a logic puzzle for you.

If you hold a large basket in one arm and pick apples with your other arm, how many apples can you put into the basket?

Assistant

Ah, that's a clever puzzle, Schon! The answer is "all of them."
The puzzle states that you're holding the basket in one arm and picking apples with the other,
but it doesn't specify a limit to the number of apples you can pick or put into the basket.
Therefore, the answer is that you can put all the apples you pick into the basket, assuming the basket is large enough to hold them all.
```