Are LLMs Vulnerable to Preference-Undermining Attacks (PUA)? A Factorial Analysis Methodology for Diagnosing the Trade-off between Preference Alignment and Real-World Validity • Paper • 2601.06596 • Published 28 days ago • 12
Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach • Paper • 2512.02834 • Published Dec 2, 2025 • 41