FaithLens: Detecting and Explaining Faithfulness Hallucination Paper • 2512.20182 • Published 2 days ago • 7
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28 • 71
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27 • 96
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 Paper • 2510.19600 • Published Oct 22 • 68
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications Paper • 2509.26490 • Published Sep 30 • 19
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning Paper • 2505.16483 • Published May 22 • 10