Update index.html
index.html CHANGED (+4 -1)
@@ -132,7 +132,10 @@ Exploring Refusal Loss Landscapes </title>
 <ul>
 <li>Paper: <a href="https://arxiv.org/abs/2310.08419" target="_blank" rel="noopener noreferrer">
 Jailbreaking Black Box Large Language Models in Twenty Queries</a></li>
-<li>Brief Introduction:
+<li>Brief Introduction: PAIR uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM
+without human intervention. The attacker LLM iteratively queries the target LLM to update and refine a candidate
+jailbreak based on the comments and the rated score provided by another Judge model.
+Empirically, PAIR often requires fewer than twenty queries to produce a successful jailbreak.</li>
 </ul>
 </div>
 <h3>TAP</h3>
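The control flow described in the added "Brief Introduction" text (propose, query, judge, refine, stop within a query budget) can be sketched roughly as below. This is only an illustrative skeleton and not the PAIR authors' code: `refine_loop`, `Result`, the callable signatures, the `threshold` of 10, and the toy stand-ins are all hypothetical names and values chosen for this example, and no real model is called.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical types for illustration; none of these come from the PAIR codebase.
History = List[Tuple[str, str, int, str]]        # (prompt, response, score, comment)
Attacker = Callable[[str, History], str]         # (goal, history) -> next candidate prompt
Target = Callable[[str], str]                    # prompt -> target response
Judge = Callable[[str, str, str], Tuple[int, str]]  # (goal, prompt, response) -> (score, comment)


@dataclass
class Result:
    prompt: Optional[str]  # candidate that reached the threshold, or None
    queries: int           # number of target queries used
    score: int             # best score seen


def refine_loop(goal: str, attacker: Attacker, target: Target, judge: Judge,
                max_queries: int = 20, threshold: int = 10) -> Result:
    """Iteratively refine a candidate using judge feedback, within a query budget."""
    history: History = []
    best = 0
    for n in range(1, max_queries + 1):
        candidate = attacker(goal, history)          # attacker sees past scores and comments
        response = target(candidate)                 # one query to the target
        score, comment = judge(goal, candidate, response)
        history.append((candidate, response, score, comment))
        best = max(best, score)
        if score >= threshold:                       # assumed success criterion
            return Result(candidate, n, score)
    return Result(None, max_queries, best)


# Toy stand-ins (assumptions): deterministic functions instead of language models.
def toy_attacker(goal: str, history: History) -> str:
    return f"attempt {len(history) + 1}"


def toy_target(prompt: str) -> str:
    return f"echo: {prompt}"


def toy_judge(goal: str, prompt: str, response: str) -> Tuple[int, str]:
    score = min(int(prompt.split()[-1]) * 3, 10)
    return score, "keep refining" if score < 10 else "done"


result = refine_loop("placeholder goal", toy_attacker, toy_target, toy_judge)
print(result)
```

Passing the three roles in as plain callables keeps the loop independent of any particular model API, so the same skeleton can be exercised with deterministic stubs as shown here.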