AgentDoG1.5 Collection: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security • 11 items • Updated 4 days ago • 13
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security Paper • 2605.29801 • Published 28 days ago • 144
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety Paper • 2604.12710 • Published Apr 13 • 5
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! Paper • 2505.15656 • Published May 21, 2025 • 15
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study Paper • 2505.15404 • Published May 21, 2025 • 13
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs Paper • 2505.13529 • Published May 18, 2025 • 12