arxiv:2604.05172
Wenbo Chen
wenbochen111
AI & ML interests
LLM
Recent Activity
authored a paper about 14 hours ago
ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces
upvoted a paper about 16 hours ago
ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces
authored a paper about 2 months ago
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
Organizations
None yet