Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems Paper • 2410.13334 • Published Oct 17, 2024 • 12
Mixture of Experts Explained Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Published Dec 11, 2023 • 1.15k
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models Paper • 2410.01524 • Published Oct 2, 2024 • 3