Papers
arxiv:2602.10522

Consistency Meets Verification: Enhancing Test Generation Quality in Large Language Models Without Ground-Truth Solutions

Published on Feb 11
Authors:
,
,
,
,

Abstract

ConVerTest presents a two-stage pipeline for generating reliable tests without ground-truth code, using self-consistency, chain-of-verification, and dual execution agreement to improve test validity and coverage.

AI-generated summary

Large Language Models (LLMs) have significantly advanced automated test generation, yet existing methods often rely on ground-truth code for verification, risking bug propagation and limiting applicability in test-driven development. We present ConVerTest, a novel two-stage pipeline for synthesizing reliable tests without requiring prior code implementations. ConVerTest integrates three core strategies: (i) Self-Consistency(SC) to generate convergent test cases via majority voting; (ii) Chain-of-Verification (CoVe) for iterative, reasoning-guided code refinement; and (iii) a Dual Execution Agreement to crossvalidate code and tests through consensus. Experiments on BIGCODEBENCH and LESS BASIC PYTHON PROBLEMS (LBPP) benchmarks demonstrate that ConVerTest improves test validity, line coverage, and mutation scores by up to 39%, 28%, and 18% respectively over baselines. Our findings highlight ConVerTest as a robust solution for mitigating hallucinations and enhancing the reliability of autonomous software testing agents.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2602.10522
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2602.10522 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2602.10522 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2602.10522 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.