SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization Paper • 2511.12982 • Published Nov 17, 2025 • 4 • 2
Backdoor Cleaning without External Guidance in MLLM Fine-tuning Paper • 2505.16916 • Published May 22, 2025 • 17 • 2