image2drawio


title: Image2drawio
emoji: 🐢
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
license: mit
short_description: A multi-agent application that converts images into drawio
tags:
  - agent-demo-track
  - mcp-server-track

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Image to Draw.io Converter

A multi-agent application that converts images into editable Draw.io diagrams using advanced LLM-based object detection and diagram generation. The system is built with LangGraph and features a modern Gradio web interface, which is also exposed as an MCP server for integration with external clients.

An example result is included in "example-outcome.png".

Features

Multi-Agent Pipeline (LangGraph):
- Supervisor Agent: Orchestrates the workflow and coordinates the agents.
- Object Detection Agent (ReAct): Uses an LLM to detect and extract objects from the uploaded image.
- Draw.io Generator Agent (ReAct): Converts detected objects into Draw.io XML diagrams.
Automated Workflow:
1. Upload an image via the Gradio interface.
2. The object detection agent identifies and extracts key objects.
3. The diagram generator agent creates a Draw.io XML diagram.
4. Download the .drawio file or preview the SVG directly in the browser.
Modern Gradio UI:
- Simple drag-and-drop image upload.
- Real-time status updates and diagram preview.
- Downloadable Draw.io file and copyable XML.
- SVG preview of the generated diagram.
- Exposed as an MCP server for programmatic access.
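
The detect-then-generate workflow above can be sketched in plain Python. This is a minimal illustration, not the project's actual LangGraph code: `DetectedObject`, `detect_objects`, `to_drawio_xml`, and `run_pipeline` are hypothetical names, and the detection step is stubbed with fixed data where the real application calls an LLM. The output uses the uncompressed mxGraphModel XML layout that diagrams.net can open.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class DetectedObject:
    """Hypothetical record for one shape found in the image."""
    label: str
    x: int
    y: int
    width: int
    height: int


def detect_objects(image_path: str) -> list[DetectedObject]:
    """Stub for the object detection agent (the real one calls an LLM)."""
    return [
        DetectedObject("Start", 40, 40, 120, 60),
        DetectedObject("End", 240, 40, 120, 60),
    ]


def to_drawio_xml(objects: list[DetectedObject]) -> str:
    """Stub for the generator agent: build uncompressed Draw.io XML."""
    mxfile = ET.Element("mxfile")
    diagram = ET.SubElement(mxfile, "diagram", name="Page-1")
    model = ET.SubElement(diagram, "mxGraphModel")
    root = ET.SubElement(model, "root")
    # Cells "0" and "1" are the root and the default layer of a diagram.
    ET.SubElement(root, "mxCell", id="0")
    ET.SubElement(root, "mxCell", id="1", parent="0")
    for i, obj in enumerate(objects, start=2):
        cell = ET.SubElement(
            root, "mxCell", id=str(i), value=obj.label,
            style="rounded=0;whiteSpace=wrap;html=1;", vertex="1", parent="1",
        )
        geometry = ET.SubElement(
            cell, "mxGeometry", x=str(obj.x), y=str(obj.y),
            width=str(obj.width), height=str(obj.height),
        )
        # "as" is a Python keyword, so it cannot be passed as a keyword argument.
        geometry.set("as", "geometry")
    return ET.tostring(mxfile, encoding="unicode")


def run_pipeline(image_path: str) -> str:
    """Supervisor role: run detection, then generation, in order."""
    return to_drawio_xml(detect_objects(image_path))
```

Writing the string returned by `run_pipeline("sketch.png")` to a file with a `.drawio` extension should give a diagram with two labeled rectangles.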

Requirements

Python 3.10+
Install dependencies:
```
pip install -r requirements.txt
```
API keys for your LLM provider (e.g., Google Gemini).

Configure your .env file:

GEMINI_API_KEY=your_gemini_api_key
GEMINI_MODEL_NAME=gemini-2.5-pro-preview-06-05
GEMINI_THINKING_BUDGET=128
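
As a rough sketch of how these three variables might be read and validated at startup: `GeminiSettings` and `load_settings` are hypothetical names, not taken from the project. The sketch assumes the .env values have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`; whether the project does this is an assumption).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GeminiSettings:
    """Hypothetical container for the values from the .env file."""
    api_key: str
    model_name: str
    thinking_budget: int


def load_settings(env: Mapping[str, str] = os.environ) -> GeminiSettings:
    """Read the three variables from a mapping and validate them."""
    api_key = env.get("GEMINI_API_KEY", "")
    if not api_key:
        raise ValueError("GEMINI_API_KEY is not set")
    return GeminiSettings(
        api_key=api_key,
        # The fallbacks are illustrative and mirror the sample .env above.
        model_name=env.get("GEMINI_MODEL_NAME", "gemini-2.5-pro-preview-06-05"),
        # int() raises ValueError if the budget is not a whole number.
        thinking_budget=int(env.get("GEMINI_THINKING_BUDGET", "128")),
    )
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.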

Usage

Install dependencies:
```
pip install -r requirements.txt
```
Configure environment:
- Create a .env file in the project root with your API keys and model settings.
Start the application:
```
python app.py
```
Access the web interface:
- Open the provided local URL in your browser.
- Upload an image (diagram, sketch, chart, etc.).
- Click "Generate Diagram".
- Download the .drawio file or copy the XML.
- Open the file in diagrams.net for further editing.

MCP Server Integration

The Gradio app is also exposed as an MCP server (mcp_server=True), allowing integration with any MCP-compatible client or workflow. This enables automated or remote usage in larger pipelines.
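
A minimal sketch of what enabling the MCP server can look like in Gradio 5, assuming the `gradio[mcp]` extra is installed. The `generate_diagram` function and its placeholder body are hypothetical stand-ins for the project's real pipeline, and the actual app.py may be structured differently. Gradio uses the function's docstring and type hints to describe the tool to MCP clients.

```python
def generate_diagram(image_path: str) -> str:
    """Convert an image into Draw.io XML.

    Args:
        image_path: Path to the uploaded image file.

    Returns:
        The diagram as a Draw.io XML string.
    """
    # Placeholder result; the real app would invoke the agent pipeline here.
    return "<mxfile></mxfile>"


def main() -> None:
    # Imported here so generate_diagram stays usable without Gradio installed.
    import gradio as gr

    demo = gr.Interface(
        fn=generate_diagram,
        inputs=gr.Image(type="filepath"),
        outputs=gr.Textbox(label="Draw.io XML"),
    )
    # mcp_server=True serves the web UI and also exposes the function as an MCP tool.
    demo.launch(mcp_server=True)
```

Calling `main()` starts the web interface; an MCP-compatible client can then connect to the MCP endpoint that Gradio reports at startup.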

Project Structure

app.py — Main entry point, Gradio UI, and workflow orchestration.
nodes/ — Agent and LangGraph node definitions.
tools/ — LLM tools for object detection and Draw.io generation.
output_llm/ — Generated Draw.io files, SVG previews, and logs.
files/ — Uploaded user images.

Notes

The LLM model and thinking budget are fully configurable via .env for maximum flexibility.
The application is designed for easy extension with new agents or tools.
Performance varies considerably with the LLM: in my experience, Gemini 2.5 Pro performed much better than Gemini 2.5 Flash.

License

MIT License

Author: Giustino Esposito