---
title: Data Analyst MCP
emoji: 🚀
colorFrom: green
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
short_description: Analyze survey data and generate stats and charts from SQL
---
# Data Analyst agent with MCP
A data analysis assistant focused on the `surveyData` table, with automatic chart generation.
## What this app does
- Answers natural-language questions about survey data.
- Runs data queries through MCP tools.
- Generates charts (`bar`, `pie`, `line`) when useful.
- Displays results in a Gradio UI that works well on Hugging Face Spaces.
## Tech stack
- Python 3.11+
- [Gradio 6.12.0](https://www.gradio.app/)
- `smolagents` + MCP client (SSE transport)
- Azure OpenAI (configurable model)
- `matplotlib` + `seaborn` for charts
## Project structure
```text
.
├── app.py            # Gradio UI + agent orchestration
├── tools.py          # Local tool: generate_chart(...)
├── prompt.py         # Agent system instructions
├── charts/           # Generated PNG files
├── requirements.txt
└── README.md
```
## Prerequisites
- An MCP endpoint reachable over SSE.
- An Azure OpenAI deployment.
- A `.env` file (currently loaded from the parent directory of this project).
## Environment variables
Used by `app.py`:
- `MCP_SERVER_URL` (example: `https://.../gradio_api/mcp/sse`)
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_API_VERSION` (default: `2024-12-01-preview`)
- `AZURE_OPENAI_MODEL` (default: `gpt-4o`)
- `PORT` (default: `7860`)
Minimal `.env` example:
```env
MCP_SERVER_URL=https://your-mcp-server/gradio_api/mcp/sse
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=YOUR_KEY
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_MODEL=gpt-4o
```
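The variables and defaults listed above could be read roughly as follows. This is a minimal sketch, not the actual code in `app.py`: the `load_settings` helper and its return shape are hypothetical, and raising on missing values is a choice made for this example. It takes a plain mapping, so it works with `os.environ` once the `.env` file has been loaded.

```python
import os
from typing import Mapping

# Defaults taken from the variable list in this README.
DEFAULTS = {
    "AZURE_OPENAI_API_VERSION": "2024-12-01-preview",
    "AZURE_OPENAI_MODEL": "gpt-4o",
    "PORT": "7860",
}
# Variables with no documented default.
REQUIRED = ("MCP_SERVER_URL", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: collect settings and fail early on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # Fall back to the documented default when unset or empty.
        settings[name] = env.get(name) or default
    settings["PORT"] = int(settings["PORT"])
    return settings
```

Failing at startup with the list of missing names tends to be easier to debug on Spaces than a later error in the middle of a chat.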
## Run locally
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
```
Then open `http://localhost:7860`.
## Deploy to Hugging Face Spaces
This repository is already configured as a Gradio Space through the front-matter at the top of this README.
Simple workflow:
1. Make your code changes.
2. Commit to `main`.
3. Push to the Space remote:
```bash
git push origin main
```
The Space rebuilds automatically.
## Runtime flow (high level)
1. User asks a question.
2. `ToolCallingAgent` calls MCP tools (and `generate_chart` when relevant).
3. If a chart is generated, it is encoded in base64 and rendered inline.
4. The assistant answer is shown in chat.
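Step 3 can be sketched as below. This is an illustration under assumptions, not the code in `app.py`: the helper name `png_to_inline_markdown` is hypothetical, and it assumes the chart is embedded in the chat message as a Markdown image with a `data:` URI.

```python
import base64
from pathlib import Path


def png_to_inline_markdown(png_path: str, alt_text: str = "chart") -> str:
    """Hypothetical helper: embed a PNG file as a base64 data URI in Markdown."""
    raw = Path(png_path).read_bytes()
    encoded = base64.b64encode(raw).decode("ascii")
    return f"![{alt_text}](data:image/png;base64,{encoded})"
```

Embedding the image in the message itself means the chat does not need a separate URL to serve the file from `charts/`.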
## Prompting / important data rule
In `prompt.py`, one strict rule is enforced:
- the SQL table is always **`"surveyData"`** (capital `D`)
If your SQL backend changes, update this part first.
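The quoting matters in SQL dialects that fold unquoted identifiers to lower case (PostgreSQL is one), where `surveyData` without quotes would resolve to `surveydata`. Whether that applies depends on the backend behind the MCP server, which this README does not specify. As a rough sketch, a guard like the following could flag generated SQL that breaks the rule; `uses_quoted_table` is a hypothetical helper and is not part of `prompt.py`.

```python
import re

TABLE = '"surveyData"'  # exact quoted form required by the rule


def uses_quoted_table(sql: str) -> bool:
    """Return True if the table is referenced and every reference is exactly quoted."""
    # Find every case variant of the name, with or without surrounding quotes.
    mentions = re.findall(r'"?surveydata"?', sql, flags=re.IGNORECASE)
    return bool(mentions) and all(m == TABLE for m in mentions)
```

For example, `uses_quoted_table('SELECT COUNT(*) FROM "surveyData"')` is true, while the same query with an unquoted `surveyData` is rejected.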
## Useful example questions
- `How many questions do we have?`
- `Sub-options per question`
- `Distribution by question type`
- `Top 5 questions by number of sub-options`
- `Percentage distribution of questions`
## Troubleshooting
### 1) `SSE Stream ended` / `httpx.ConnectError`
Common cause: the MCP endpoint is unreachable (wrong URL, sleeping service, network issue).
Check:
- `MCP_SERVER_URL` is correct
- endpoint is reachable from the deployment environment
- MCP server is actually running
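A quick way to run these checks from the deployment environment is a small probe script. The sketch below uses only the standard library; `check_endpoint` is a hypothetical helper, and a successful HTTP response only shows that the URL is reachable, not that the MCP handshake will succeed.

```python
import urllib.error
import urllib.request


def check_endpoint(url: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Hypothetical probe: report whether the URL answers an HTTP request."""
    try:
        request = urllib.request.Request(url, headers={"Accept": "text/event-stream"})
        # urlopen returns once the response headers arrive, so the SSE body is not read.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, refused connection, timeout, or malformed URL.
        return False, f"unreachable: {exc}"
```

Calling `check_endpoint(os.environ["MCP_SERVER_URL"])` before starting the agent gives a clearer message than the `httpx.ConnectError` raised later.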
### 2) Gradio startup errors
Examples already encountered:
- unsupported `type` argument on `gr.Chatbot`
- unsupported `bubble_full_width` argument
If this happens again, confirm the Gradio version in `requirements.txt` and align component parameters accordingly.
### 3) UI page growing infinitely on Spaces
The app now uses a native Gradio layout with minimal CSS, which avoids the iframe-related layout issues that caused the page to keep growing.
### 4) Missing Azure key
Without `AZURE_OPENAI_API_KEY`, the assistant cannot answer.
Set it in Space Settings > Secrets.
## Maintenance notes
- Generated charts are stored in `charts/`.
- `generate_chart` saves PNG files with a title-derived filename.
- If needed, clean old files in `charts/` periodically.
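The periodic cleanup could look like the following. This is a minimal sketch rather than something shipped in the repository: the `clean_old_charts` name and the 7-day default are assumptions, and it only touches `*.png` files directly inside the given directory.

```python
import time
from pathlib import Path


def clean_old_charts(charts_dir: str = "charts", max_age_days: float = 7.0) -> list[str]:
    """Hypothetical helper: delete PNG files older than max_age_days and return their names."""
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    removed = []
    for png in Path(charts_dir).glob("*.png"):
        # Compare the last-modified time against the cutoff.
        if png.is_file() and png.stat().st_mtime < cutoff:
            png.unlink()
            removed.append(png.name)
    return sorted(removed)
```

It can be run by hand or called once at startup in `app.py`.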
## Helpful reference
- Spaces config docs: https://huggingface.co/docs/hub/spaces-config-reference