GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging
Paper • 2508.18993 • Published • 4
Code Agent | DeepSearch
EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines
Controlled Self-Evolution for Algorithmic Code Optimization