arxiv:2605.29271

CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

Published on May 28 · Submitted by Ashutosh Hathidara on May 29

Abstract

AI-generated summary: CoHyDE is an iterative method that jointly trains a dense encoder and an LLM rewriter to improve tool retrieval from API catalogs, achieving better performance on both specific and vague queries through co-evolutionary training.

Tool retrieval over large API catalogs is a core bottleneck for LLM agents: user queries arrive in colloquial, often underspecified language, while the catalog uses technical API vocabulary that no fixed encoder can bridge on its own. The two dominant training approaches, contrastive encoder fine-tuning and HyDE-style query expansion with a frozen LLM, address this problem from opposite ends and fail in complementary directions: the fine-tuned encoder excels when the query's surface form already matches the catalog but collapses when it does not, while zero-shot HyDE is more robust to underspecified queries yet generates catalog-unaware hypothetical descriptions that degrade retrieval when queries are well-formed. We introduce CoHyDE, an iterative procedure that trains the dense encoder and the LLM rewriter as a single co-evolving system: the encoder is retrained with InfoNCE on catalog-style hypothetical descriptions produced by the rewriter, and the rewriter is preference-aligned via DPO against the encoder's retrieval scores, with both sides warm-started on the tool catalog before the loop begins. On a ~10k tool subset of the ToolBench catalog, three rounds of CoHyDE improve over the strongest single-component baseline by +2.5 pp NDCG@5 on standard queries and +6.3 pp on held-out vague queries, with gains as large as +8 pp on the hardest vague tier. Ablations confirm that co-training is the key ingredient: using either component in isolation fails to match CoHyDE on both well-formed and vague queries, with losses of up to -8 pp on vague queries.
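The two objectives named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`info_nce`, `dpo_loss`, `build_preference_pair`), the temperature and beta defaults, and the rule of picking the best- and worst-scoring rewrites as the preference pair are all assumptions made for illustration. The abstract only says that the encoder is trained with InfoNCE and that the rewriter is aligned via DPO against the encoder's retrieval scores.

```python
import math


def _cosine(a, b):
    # Cosine similarity between two non-zero vectors given as lists of floats.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(query_vec, positive_vec, negative_vecs, temperature=0.05):
    """InfoNCE loss for one query: -log softmax of the positive's similarity.

    The temperature default is an assumption, not a value from the paper.
    """
    logits = [_cosine(query_vec, positive_vec) / temperature]
    logits += [_cosine(query_vec, n) / temperature for n in negative_vecs]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair of sequence log-probabilities."""
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) == softplus(-margin), computed stably.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def build_preference_pair(rewrites, retrieval_scores):
    """Hypothetical pairing rule: the highest-scoring rewrite is 'chosen',
    the lowest-scoring one is 'rejected'."""
    order = sorted(range(len(rewrites)), key=lambda i: retrieval_scores[i])
    return rewrites[order[-1]], rewrites[order[0]]
```

In a loop of the kind the abstract describes, the encoder step would minimize `info_nce` over rewriter-produced descriptions paired with their target tools, and the rewriter step would minimize `dpo_loss` over pairs such as those returned by `build_preference_pair`, with scores taken from the current encoder.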

Community

CoHyDE introduces an iterative co-training procedure that jointly optimizes a dense encoder and an LLM rewriter for tool retrieval. By alternating between retraining the encoder on the rewriter’s catalog-aligned hypothetical descriptions and aligning the rewriter via DPO against the encoder’s retrieval scores, CoHyDE closes the vocabulary gap between colloquial queries and technical APIs. On a 10k-tool subset of ToolBench, it achieves gains of +2.5 pp NDCG@5 on standard queries and +6.3 pp on vague queries, with improvements up to +8 pp on the hardest vague tier.
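For readers unfamiliar with the metric behind these numbers, here is a minimal sketch of NDCG@5 for a single query. It assumes binary relevance (a retrieved tool is either a ground-truth tool or not) and unique tool IDs in the ranking; whether the paper uses binary or graded relevance is not stated on this page, and the function names are illustrative.

```python
import math


def dcg_at_k(gains, k):
    # Discounted cumulative gain: the gain at 0-based rank r is divided by log2(r + 2).
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_tool_ids, relevant_tool_ids, k=5):
    """NDCG@k with binary relevance (an assumption for this sketch)."""
    gains = [1.0 if tool in relevant_tool_ids else 0.0 for tool in ranked_tool_ids]
    # The ideal ranking places every relevant tool first, up to k positions.
    ideal_gains = [1.0] * min(len(relevant_tool_ids), k)
    ideal = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0
```

A reported score would be the mean of `ndcg_at_k` over all evaluation queries, and a gain in percentage points (pp) is the difference between two such means expressed on a 0 to 100 scale.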
