sekpona kokou
metadata
title: Data Analyst MCP
emoji: 🚀
colorFrom: green
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
short_description: Analyze survey data and generate stats and charts from SQL

Data Analyst agent with MCP

A data analysis assistant focused on the surveyData table, with automatic chart generation.

What this app does

  • Answers natural-language questions about survey data.
  • Runs data queries through MCP tools.
  • Generates charts (bar, pie, line) when useful.
  • Displays results in a Gradio UI that works well on Hugging Face Spaces.

Tech stack

  • Python 3.11+
  • Gradio 6.12.0
  • smolagents + MCP client (SSE transport)
  • Azure OpenAI (configurable model)
  • matplotlib + seaborn for charts

Project structure

.
├── app.py            # Gradio UI + agent orchestration
├── tools.py          # Local tool: generate_chart(...)
├── prompt.py         # Agent system instructions
├── charts/           # Generated PNG files
├── requirements.txt
└── README.md

Prerequisites

  • An MCP endpoint reachable over SSE.
  • An Azure OpenAI deployment.
  • A .env file (currently loaded from the parent directory of this project).

Environment variables

Used by app.py:

  • MCP_SERVER_URL (example: https://.../gradio_api/mcp/sse)
  • AZURE_OPENAI_ENDPOINT
  • AZURE_OPENAI_API_KEY
  • AZURE_OPENAI_API_VERSION (default: 2024-12-01-preview)
  • AZURE_OPENAI_MODEL (default: gpt-4o)
  • PORT (default: 7860)
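
The variables above could be read along the following lines. This is a minimal sketch, not the actual code in app.py: the `Settings` dataclass and the `load_settings` helper are hypothetical names, and treating the three variables without a documented default as required is an assumption. Only the variable names and default values come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    mcp_server_url: str
    azure_endpoint: str
    azure_api_key: str
    azure_api_version: str
    azure_model: str
    port: int


def load_settings(env=None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: variables with no documented default are mandatory.
    required = ("MCP_SERVER_URL", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        mcp_server_url=env["MCP_SERVER_URL"],
        azure_endpoint=env["AZURE_OPENAI_ENDPOINT"],
        azure_api_key=env["AZURE_OPENAI_API_KEY"],
        azure_api_version=env.get("AZURE_OPENAI_API_VERSION", "2024-12-01-preview"),
        azure_model=env.get("AZURE_OPENAI_MODEL", "gpt-4o"),
        port=int(env.get("PORT", "7860")),
    )
```

Failing early with the names of the missing variables makes a misconfigured Space easier to diagnose than an error raised on the first chat message.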

Minimal .env example:

MCP_SERVER_URL=https://your-mcp-server/gradio_api/mcp/sse
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=YOUR_KEY
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_MODEL=gpt-4o

Run locally

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py

Then open http://localhost:7860.

Deploy to Hugging Face Spaces

This repository is already configured as a Gradio Space through the front matter (the metadata block) at the top of this README.

Simple workflow:

  1. Make your code changes.
  2. Commit to main.
  3. Push to the Space remote:
git push origin main

The Space rebuilds automatically.

Runtime flow (high level)

  1. User asks a question.
  2. ToolCallingAgent calls MCP tools (and generate_chart when relevant).
  3. If a chart is generated, it is encoded in base64 and rendered inline.
  4. The assistant answer is shown in chat.
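
Step 3 can be illustrated with a short sketch. It assumes the chart is a PNG file on disk and that the chat message accepts Markdown image syntax. The function names `png_to_data_uri` and `png_to_markdown` are hypothetical, and app.py may embed the image differently (for example with an HTML img tag).

```python
import base64
from pathlib import Path


def png_to_data_uri(path) -> str:
    """Return a data URI that embeds the PNG file at `path` as base64."""
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def png_to_markdown(path, alt="chart") -> str:
    """Wrap the data URI in Markdown image syntax for a chat message."""
    return f"![{alt}]({png_to_data_uri(path)})"
```

Embedding the image in the message itself avoids having to serve files from charts/ over a separate URL.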

Prompting / important data rule

In prompt.py, one strict rule is enforced:

  • the SQL table is always "surveyData" (capital D)

If your SQL backend changes, update this part first.
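
If you want to verify the rule outside the prompt, a small check such as the following could flag queries that use a different table name or casing. This is a hedged sketch and not part of prompt.py: `wrong_table_refs` is a hypothetical helper, and the regular expression only looks at simple FROM and JOIN clauses. The README does not name the SQL backend; in backends such as PostgreSQL, unquoted identifiers are folded to lowercase, which would be one reason the exact casing matters.

```python
import re

TABLE_NAME = "surveyData"  # exact casing required by the rule in prompt.py

# Captures the identifier after FROM or JOIN, with or without double quotes.
_TABLE_REF = re.compile(
    r'\b(?:FROM|JOIN)\s+"?([A-Za-z_][A-Za-z0-9_]*)"?',
    re.IGNORECASE,
)


def wrong_table_refs(sql: str) -> list[str]:
    """Return table names in `sql` that are not exactly TABLE_NAME."""
    return [name for name in _TABLE_REF.findall(sql) if name != TABLE_NAME]
```

An empty list means every table reference the pattern found uses the expected name.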

Useful example questions

  • How many questions do we have?
  • Sub-options per question
  • Distribution by question type
  • Top 5 questions by number of sub-options
  • Percentage distribution of questions

Troubleshooting

1) SSE Stream ended / httpx.ConnectError

Common cause: the MCP endpoint is unreachable (wrong URL, sleeping service, or network issue).

Check:

  • MCP_SERVER_URL is correct
  • endpoint is reachable from the deployment environment
  • MCP server is actually running
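
The first two checks can be approximated from the deployment environment with the standard library. The sketch below only tests whether a TCP connection to the host and port in MCP_SERVER_URL succeeds; it does not confirm that an MCP server is answering on the SSE path. The helper names `host_and_port` and `can_connect` are hypothetical.

```python
import socket
from urllib.parse import urlparse


def host_and_port(url: str) -> tuple[str, int]:
    """Extract host and port from a URL, falling back to the scheme default."""
    parsed = urlparse(url)
    if not parsed.hostname:
        raise ValueError(f"No host in URL: {url!r}")
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def can_connect(url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect(os.environ["MCP_SERVER_URL"])` returning False points to a network or URL problem rather than an agent problem.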

2) Gradio startup errors

Examples already encountered:

  • unsupported type argument on gr.Chatbot
  • unsupported bubble_full_width argument

If this happens again, check the Gradio version pinned in requirements.txt and use only the component parameters that version supports.
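
One way to confirm the version is to compare the pin in requirements.txt with what is actually installed. This is a sketch under the assumption that requirements.txt pins Gradio with `==`; `pinned_version` and `installed_version` are hypothetical helpers, while `importlib.metadata.version` is part of the standard library.

```python
import re
from importlib import metadata


def pinned_version(requirements_text: str, package: str):
    """Return the `==` pin for `package` in requirements text, or None."""
    pattern = re.compile(
        rf"^\s*{re.escape(package)}(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(requirements_text)
    return match.group(1) if match else None


def installed_version(package: str):
    """Return the installed version of `package`, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

If `pinned_version(open("requirements.txt").read(), "gradio")` and `installed_version("gradio")` differ, or neither matches sdk_version in the metadata block, that mismatch is a likely source of unsupported-argument errors.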

3) UI page growing infinitely on Spaces

The app now uses a native Gradio layout with minimal CSS to avoid iframe-related layout issues.

4) Missing Azure key

Without AZURE_OPENAI_API_KEY, the assistant cannot answer. Set it in Space Settings > Secrets.

Maintenance notes

  • Generated charts are stored in charts/.
  • generate_chart saves PNG files with a title-derived filename.
  • If needed, clean old files in charts/ periodically.
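
The periodic cleanup could be done with a small script like the one below. It is a sketch, not something shipped in this repository: `remove_old_charts` is a hypothetical helper, and the 7-day default is an arbitrary assumption to adjust as needed.

```python
import time
from pathlib import Path


def remove_old_charts(charts_dir="charts", max_age_days: float = 7.0, now=None):
    """Delete PNG files older than `max_age_days`; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for path in Path(charts_dir).glob("*.png"):
        # Compare the last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Only files ending in .png directly inside the directory are considered, so other files placed in charts/ are left untouched.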

Helpful reference