arxiv:2505.02387
Xiusi Chen
XtremSup
Recent Activity
upvoted a paper about 8 hours ago
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
upvoted a paper about 2 months ago
CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing