arxiv:2606.24428

Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning

Published on Jun 23

Submitted by zhushiding on Jun 24

Zhejiang University


Authors:

Abstract

EDV is a three-stage framework that uses multiple heterogeneous agents to collaboratively construct reliable experiences for LLM agents, preventing self-confirmatory errors through execute-distill-verify processes.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Experience-driven self-evolution is critical for large language model (LLM) agents to improve through open-world interaction. However, existing experience learning methods mostly rely on single-agent loops, where the same agent executes tasks, summarizes outcomes, and determines memory content. This setup makes agents vulnerable to the Self-Confirmation Trap: wrong-but-self-consistent trajectories are misidentified as successful experience, leading to cumulative errors during retrieval and reuse. To address this issue, we propose EDV, an Execute-Distill-Verify framework for reliable experience learning. In the Execute stage, multiple heterogeneous agents explore the same task space in parallel to generate diverse candidate trajectories. In the Distill stage, a dedicated third-party agent comparatively analyzes these trajectories to produce candidate experiences, reducing executor-centric summarization bias. In the Verify stage, the execution group validates candidates via a consensus mechanism, and only approved experiences are written into shared or private memory. By decoupling the three stages, EDV transforms experience learning from isolated self-reflection into collaborative construction, filtering erroneous and noisy content before memory insertion. We evaluate EDV on three challenging long-horizon benchmarks: τ²-bench, Mind2Web, and MMTB. Results show EDV consistently outperforms strong baselines, validating that reliable experience construction is essential for robust agent self-evolution. Our code is available at https://github.com/shidingz/EDV.
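The three stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the linked repository for that); every name here (`Trajectory`, `Experience`, `execute`, `distill`, `verify`, `edv_step`, `toy_agent`, the `quorum` parameter) is hypothetical. The distiller is reduced to picking the most common answer, a trajectory is reduced to a single answer string, and memory is a single list, so the shared/private memory split mentioned in the abstract is not modeled.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Trajectory:
    agent: str   # name of the executor that produced it
    answer: str  # stand-in for a full action trace (assumption)


@dataclass(frozen=True)
class Experience:
    task: str
    lesson: str


Executor = Callable[[str], str]
Verifier = Callable[[Experience], bool]


def execute(task: str, executors: Dict[str, Executor]) -> List[Trajectory]:
    """Execute: every executor attempts the same task."""
    return [Trajectory(name, run(task)) for name, run in executors.items()]


def distill(task: str, trajectories: List[Trajectory]) -> Experience:
    """Distill: a third party compares trajectories and proposes a candidate.
    Placeholder logic (assumption): take the most common answer."""
    lesson, _ = Counter(t.answer for t in trajectories).most_common(1)[0]
    return Experience(task, lesson)


def verify(candidate: Experience, verifiers: List[Verifier], quorum: int) -> bool:
    """Verify: the execution group votes; approve only if the quorum is met."""
    approvals = sum(1 for vote in verifiers if vote(candidate))
    return approvals >= quorum


def edv_step(task: str, executors: Dict[str, Executor],
             verifiers: List[Verifier], quorum: int,
             memory: List[Experience]) -> bool:
    """Run one Execute-Distill-Verify pass; write to memory only on approval."""
    candidate = distill(task, execute(task, executors))
    if verify(candidate, verifiers, quorum):
        memory.append(candidate)
        return True
    return False


def toy_agent(answers: Dict[str, str]) -> Tuple[Executor, Verifier]:
    """Hypothetical agent: answers from a lookup table and approves a
    candidate only when it matches its own answer for that task."""
    def run(task: str) -> str:
        return answers[task]

    def vote(candidate: Experience) -> bool:
        return answers[candidate.task] == candidate.lesson

    return run, vote


run_a, vote_a = toy_agent({"t1": "X", "t2": "P"})
run_b, vote_b = toy_agent({"t1": "X", "t2": "Q"})
run_c, vote_c = toy_agent({"t1": "Y", "t2": "R"})
executors = {"a": run_a, "b": run_b, "c": run_c}
verifiers = [vote_a, vote_b, vote_c]

memory: List[Experience] = []
accepted_t1 = edv_step("t1", executors, verifiers, 2, memory)  # two of three agree
accepted_t2 = edv_step("t2", executors, verifiers, 2, memory)  # no agreement
print(accepted_t1, accepted_t2, memory)
```

In this toy run the candidate for `t1` reaches the quorum and is stored, while the candidate for `t2`, on which the executors all disagree, is filtered out before it reaches memory. A real distiller and verifier would be LLM-backed and would reason over full trajectories rather than comparing strings.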


Community

zhushiding

Paper submitter about 7 hours ago

Experience-driven self-evolution is essential for large language model (LLM) agents to improve through interaction with open-world environments. However, existing experience learning methods largely rely on single-agent loops, in which the same agent executes tasks, summarizes outcomes, and decides what should be written into memory. In such settings, agents are prone to the Self-Confirmation Trap, where wrong-but-self-consistent trajectories are mistakenly treated as successful experience, leading to error accumulation through later retrieval and reuse. To address this challenge, we propose EDV, an Execute-Distill-Verify framework for reliable experience learning. In the Execute stage, multiple heterogeneous agents explore the same task space in parallel, generating diverse candidate trajectories. In the Distill stage, a designated third-party distillation agent comparatively analyzes these trajectories and produces candidate experiences, reducing the bias of executor-centric self-summarization. In the Verify stage, the execution group jointly validates candidate experiences through a consensus-based mechanism, and only experiences that pass strict validation are written into shared or private memory. By decoupling execution, distillation, and validation, EDV turns experience learning from an isolated self-reflection loop into a collaborative experience construction process that suppresses erroneous and noisy experience before memory insertion. We evaluate EDV on challenging long-horizon benchmarks, including τ²-bench, Mind2Web, and MMTB. Experimental results show that EDV consistently outperforms strong baselines, demonstrating the value of improving the reliability of experience construction for agent self-evolution. These findings suggest that robust agent improvement depends not only on richer memory, but also on how experience is constructed before it enters memory. Our code is available at https://github.com/shidingz/EDV.
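To make the Self-Confirmation Trap concrete, here is a toy Python contrast between a single-agent loop and a peer-gated write. It is an illustrative assumption, not code from the paper: `self_loop`, `peer_gated`, `biased_agent`, `peer`, and the task string are all hypothetical, and the peer gate is a much-simplified stand-in for EDV's Verify stage.

```python
from typing import Callable, Dict, List

Agent = Callable[[str], str]


def self_loop(task: str, agent: Agent, memory: Dict[str, str]) -> None:
    """Single-agent loop: the agent that produced the answer also judges it."""
    answer = agent(task)
    # Re-asking the same agent only measures self-consistency, not correctness.
    if agent(task) == answer:
        memory[task] = answer


def peer_gated(task: str, agent: Agent, peers: List[Agent],
               memory: Dict[str, str], quorum: int) -> None:
    """Write to memory only if enough independent peers reach the same answer."""
    answer = agent(task)
    approvals = sum(1 for p in peers if p(task) == answer)
    if approvals >= quorum:
        memory[task] = answer


def biased_agent(task: str) -> str:
    # Hypothetical executor that is consistently wrong on this task.
    return "4"


def peer(task: str) -> str:
    # Hypothetical peer that happens to solve the task correctly.
    return "5"


solo_memory: Dict[str, str] = {}
gated_memory: Dict[str, str] = {}
self_loop("count items", biased_agent, solo_memory)
peer_gated("count items", biased_agent, [peer, peer], gated_memory, quorum=2)
print(solo_memory, gated_memory)
```

The self-judged loop stores the wrong-but-self-consistent answer, while the peer-gated write leaves memory empty because no peer reproduces it. Peer agreement is itself only a proxy for correctness, which is presumably one reason the paper pairs it with heterogeneous executors and a separate distillation agent.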


Get this paper in your agent:

hf papers read 2606.24428

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
