siyeng feng

siyengfeng

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers 3 days ago

CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies

Paper • 2606.16613 • Published 14 days ago • 8

The Verification Horizon: No Silver Bullet for Coding Agent Rewards

Paper • 2606.26300 • Published 5 days ago • 41

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

Paper • 2606.26790 • Published 4 days ago • 47

upvoted 3 papers 4 days ago

upvoted 7 papers 6 days ago

Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation

Paper • 2606.18844 • Published 12 days ago • 18

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 7 days ago • 79

PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems

Paper • 2606.22388 • Published 8 days ago • 95

EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory

Paper • 2606.21649 • Published 10 days ago • 32

SkillHarness: Harnessing Safe Skills for Computer-Use Agents

Paper • 2606.20636 • Published 27 days ago • 20

Notes2Skills: From Lab Notebooks to Certainty-Aware Scientific Agent Skills

Paper • 2606.11897 • Published 19 days ago • 11

Self-Compacting Language Model Agents

Paper • 2606.23525 • Published 7 days ago • 18

upvoted 7 papers 16 days ago

Recursive Agent Optimization

Paper • 2605.06639 • Published May 7 • 1

Natural-Language Agent Harnesses

Paper • 2603.25723 • Published Mar 26 • 27

EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning

Paper • 2606.03108 • Published 27 days ago • 11

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Paper • 2606.11926 • Published 19 days ago • 120

Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

Paper • 2606.12344 • Published 19 days ago • 70

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

Paper • 2606.13681 • Published 18 days ago • 142

HarnessBridge: Learnable Bidirectional Controller for LLM Agent Harness

Paper • 2606.12882 • Published 18 days ago • 13
