- **[StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?](https://arxiv.org/abs/2409.17167)** has been **published at AAAI 2025**!

This study introduces **StressPrompt**, a psychologically inspired benchmark for probing how LLMs respond under stress-inducing conditions. Results show that LLMs, like humans, follow the Yerkes-Dodson law, performing best under moderate stress. The findings offer new insights into LLM cognitive alignment, robustness, and deployment in high-stakes environments.
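The Yerkes-Dodson relationship described above can be illustrated with a toy numerical sketch (the Gaussian curve, its parameters, and the stress scale are illustrative assumptions, not the paper's benchmark or data):

```python
import math

def performance(stress: float, optimum: float = 0.5, width: float = 0.2) -> float:
    """Toy inverted-U (Yerkes-Dodson-style) curve: performance peaks at
    `optimum` stress and falls off symmetrically on either side."""
    return math.exp(-((stress - optimum) ** 2) / (2 * width ** 2))

# Sample stress levels from none (0.0) to extreme (1.0) and find the peak.
levels = [i / 10 for i in range(11)]
best = max(levels, key=performance)
# By construction, performance is maximal at moderate stress (0.5 here),
# and lower at both the no-stress and extreme-stress ends of the scale.
```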
- **[Harnessing Task Overload for Scalable Jailbreak Attacks on Large Language Models](https://arxiv.org/abs/2410.04190)** has been **posted on arXiv**!

This work proposes a **scalable jailbreak attack** that exploits task overload to bypass LLM safety mechanisms. By engaging models in resource-intensive preprocessing (e.g., character-map decoding), the attack suppresses safety-policy activation at inference time. Without requiring gradient access or handcrafted prompts, our method adapts to various model sizes and maintains high success rates, highlighting a critical vulnerability in current LLM safety designs under resource constraints.
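The "character-map decoding" preprocessing mentioned above can be sketched as a simple substitution-cipher encoding step; the function name, seeding, and lowercase-only map below are illustrative assumptions, not the paper's actual construction:

```python
import random
import string

def encode_with_char_map(text: str, seed: int = 0) -> tuple[dict, str]:
    """Build a random letter-substitution map and encode `text` with it.
    Inverting the map to decode is the resource-intensive side task."""
    rng = random.Random(seed)  # deterministic map for reproducibility
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    char_map = dict(zip(letters, shuffled))
    encoded = "".join(char_map.get(c, c) for c in text.lower())
    return char_map, encoded

char_map, encoded = encode_with_char_map("hello world")
# Recovering the original requires inverting the map character by character:
inverse = {v: k for k, v in char_map.items()}
decoded = "".join(inverse.get(c, c) for c in encoded)
```

The map is a bijection on the lowercase alphabet (non-letters pass through unchanged), so decoding always recovers the original text; the cost lies in performing the per-character lookup, which is the kind of preprocessing load the attack leans on.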