ToolRosetta: Bridging Open-Source Repositories and Large Language Model Agents through Automated Tool Standardization
Abstract
Reusing and invoking existing code remains costly and unreliable, as most practical tools are embedded in heterogeneous code repositories and lack standardized, executable interfaces. Although large language models (LLMs) and Model Context Protocol (MCP)-based tool invocation frameworks enable natural language task execution, current approaches rely heavily on manual tool curation and standardization, which fundamentally limits scalability. In this paper, we propose ToolRosetta, a unified framework that automatically translates open-source code repositories and APIs into MCP-compatible tools that can be reliably invoked by LLMs. Given a user task, ToolRosetta autonomously plans toolchains, identifies relevant codebases, and converts them into executable MCP services, enabling end-to-end task completion with minimal human intervention. In addition, ToolRosetta incorporates a security inspection layer to mitigate risks inherent in executing arbitrary code. Extensive experiments across diverse scientific domains demonstrate that ToolRosetta can automatically standardize a large number of open-source tools and reduce the human effort required for code reproduction and deployment. Notably, by seamlessly leveraging specialized open-source tools, ToolRosetta-powered agents consistently improve task completion performance compared to commercial LLMs and existing agent systems.
Community
ToolRosetta introduces an automated framework that transforms open-source repositories and APIs into MCP-compatible tools that LLM agents can discover, standardize, validate, and invoke with minimal human effort. Instead of relying on fixed, manually curated tool libraries, the system combines repository search, automated MCP construction, security inspection, and iterative repair to support end-to-end tool use across diverse scientific domains.
Across 122 GitHub repositories, ToolRosetta standardizes 1,580 tools spanning six domains. It achieves a 53.0% first-pass conversion success rate, which rises to 68.4% after iterative repair, and reduces average conversion time to 210.1 seconds per repository, compared with 1,589.4 seconds for human engineers. On downstream tasks, it reaches 55.6% macro-average accuracy across six scientific domains, with especially strong performance on out-of-distribution problems that require long-tail open-source tools.
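The paper does not reproduce its conversion code here, but the core standardization step — turning an arbitrary repository function into a tool descriptor an MCP client can discover and invoke — can be illustrated with a minimal sketch. The names `make_tool_descriptor` and the example function `gc_content` are hypothetical stand-ins, not part of ToolRosetta; the descriptor shape (name, description, JSON-Schema `inputSchema`) follows the MCP tool convention.

```python
import inspect
import json

# Map common Python annotations to JSON-Schema types (simplified illustration).
_TYPE_MAP = {int: "integer", float: "number", str: "string", bool: "boolean"}

def make_tool_descriptor(fn):
    """Build an MCP-style tool descriptor from a plain repository function,
    deriving the input schema from its signature and the description
    from its docstring. (Hypothetical helper, not ToolRosetta's actual code.)"""
    sig = inspect.signature(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": _TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value => required argument
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

# A stand-in for a function discovered inside an open-source repository.
def gc_content(sequence: str) -> float:
    """Return the GC fraction of a DNA sequence."""
    s = sequence.upper()
    return (s.count("G") + s.count("C")) / len(s)

descriptor = make_tool_descriptor(gc_content)
print(json.dumps(descriptor, indent=2))
```

In the full pipeline described above, a step like this would be followed by wrapping the function in an MCP server process, security inspection of the underlying code, and iterative repair when the generated interface fails validation.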