Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"? Paper • 2311.09325 • Published Nov 15, 2023
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings Paper • 2501.06645 • Published Jan 11, 2025
Multimodal Pragmatic Jailbreak on Text-to-image Models Paper • 2409.19149 • Published Sep 27, 2024