---
title: Data Analyst MCP
emoji: 🚀
colorFrom: green
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
short_description: Analyze survey data and generate stats and charts from SQL
---

# Data Analyst agent with MCP

A data analysis assistant focused on the `surveyData` table, with automatic chart generation.

## What this app does

- Answers natural-language questions about survey data.
- Runs data queries through MCP tools.
- Generates charts (`bar`, `pie`, `line`) when useful.
- Displays results in a Gradio UI that works well on Hugging Face Spaces.

## Tech stack

- Python 3.11+
- [Gradio 6.12.0](https://www.gradio.app/)
- `smolagents` + MCP client (SSE transport)
- Azure OpenAI (configurable model)
- `matplotlib` + `seaborn` for charts

## Project structure

```text
.
├── app.py            # Gradio UI + agent orchestration
├── tools.py          # Local tool: generate_chart(...)
├── prompt.py         # Agent system instructions
├── charts/           # Generated PNG files
├── requirements.txt
└── README.md
```

## Prerequisites

- An MCP endpoint reachable over SSE.
- An Azure OpenAI deployment.
- A `.env` file. `app.py` currently loads it from the parent directory of this project, not from the project root.

## Environment variables

Used by `app.py`:

- `MCP_SERVER_URL` (example: `https://.../gradio_api/mcp/sse`)
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_API_VERSION` (default: `2024-12-01-preview`)
- `AZURE_OPENAI_MODEL` (default: `gpt-4o`)
- `PORT` (default: `7860`)

Minimal `.env` example:

```env
MCP_SERVER_URL=https://your-mcp-server/gradio_api/mcp/sse
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=YOUR_KEY
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_MODEL=gpt-4o
```
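As a rough illustration of how these variables and their defaults fit together, here is a minimal sketch. It assumes the variables are already present in the process environment; the `Settings` class and `load_settings` function are hypothetical names for this example and are not taken from `app.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed above."""
    mcp_server_url: Optional[str]
    azure_endpoint: Optional[str]
    azure_api_key: Optional[str]
    azure_api_version: str
    azure_model: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    return Settings(
        mcp_server_url=env.get("MCP_SERVER_URL"),
        azure_endpoint=env.get("AZURE_OPENAI_ENDPOINT"),
        azure_api_key=env.get("AZURE_OPENAI_API_KEY"),
        azure_api_version=env.get("AZURE_OPENAI_API_VERSION", "2024-12-01-preview"),
        azure_model=env.get("AZURE_OPENAI_MODEL", "gpt-4o"),
        port=int(env.get("PORT", "7860")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check without touching the real environment.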

## Run locally

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
```

Then open `http://localhost:7860`.

## Deploy to Hugging Face Spaces

This repository is already configured as a Gradio Space through the front-matter at the top of this README.

Simple workflow:

1. Make your code changes.
2. Commit to `main`.
3. Push to the Space remote (the command below assumes it is named `origin`):

```bash
git push origin main
```

The Space rebuilds automatically.

## Runtime flow (high level)

1. User asks a question.
2. `ToolCallingAgent` calls MCP tools (and `generate_chart` when relevant).
3. If a chart is generated, it is encoded in base64 and rendered inline.
4. The assistant answer is shown in chat.
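Step 3 can be sketched as follows. This is only an illustration of base64-encoding a PNG into a data URI; `png_to_data_uri` is a hypothetical helper name, and how `app.py` actually embeds the result in the chat is not shown here.

```python
import base64


def png_to_data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data URI usable as an image source."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"
```

A caller would typically read a file from `charts/` in binary mode and pass its bytes to this function.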

## Prompting / important data rule

In `prompt.py`, one strict rule is enforced:

- the SQL table is always **`"surveyData"`** (capital `D`)

If your SQL backend changes, update this part first.
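The quoting matters on backends such as PostgreSQL, where unquoted identifiers are folded to lowercase; this README does not name the backend, so treat that as an assumption. The sketch below shows one way to verify that a query follows the rule. `references_table_correctly` is a hypothetical helper and is not part of `prompt.py`.

```python
import re

# Exact spelling required by the rule above: double quotes, capital D.
EXPECTED = '"surveyData"'

# Matches the table name in any case, with or without surrounding quotes.
_TABLE_REF = re.compile(r'"?\bsurveydata\b"?', re.IGNORECASE)


def references_table_correctly(sql: str) -> bool:
    """True if the query mentions the table and every mention is "surveyData"."""
    mentions = _TABLE_REF.findall(sql)
    return bool(mentions) and all(m == EXPECTED for m in mentions)
```

For example, `SELECT COUNT(*) FROM "surveyData"` passes, while the same query with `surveyData` unquoted or with `"surveydata"` does not.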

## Useful example questions

- `How many questions do we have?`
- `Sub-options per question`
- `Distribution by question type`
- `Top 5 questions by number of sub-options`
- `Percentage distribution of questions`

## Troubleshooting

### 1) `SSE Stream ended` / `httpx.ConnectError`

Common cause: the MCP endpoint is unreachable (wrong URL, sleeping service, or network issue).

Check:

- `MCP_SERVER_URL` is correct
- endpoint is reachable from the deployment environment
- MCP server is actually running
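The first check can be partly automated. The sketch below only catches malformed URLs; it does not contact the server, so it cannot detect a sleeping or stopped service. `check_mcp_url` is a hypothetical helper, and the `/sse` suffix check is an assumption based on the example URL in this README.

```python
from urllib.parse import urlparse


def check_mcp_url(url: str) -> list[str]:
    """Return a list of problems found in the URL (empty if none)."""
    problems = []
    parsed = urlparse(url or "")
    if parsed.scheme not in ("http", "https"):
        problems.append("URL should start with http:// or https://")
    if not parsed.netloc:
        problems.append("URL has no host")
    # Assumption: an SSE endpoint path ends with /sse, as in the example URL.
    if not parsed.path.rstrip("/").endswith("/sse"):
        problems.append("path does not end with /sse")
    return problems
```

Running it against the value of `MCP_SERVER_URL` before starting the app gives a quick first signal when the variable is empty or mistyped.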

### 2) Gradio startup errors

Examples already encountered:

- unsupported `type` argument on `gr.Chatbot`
- unsupported `bubble_full_width` argument

If this happens again, confirm the Gradio version in `requirements.txt` and align component parameters accordingly.

### 3) UI page growing infinitely on Spaces

This was caused by iframe-related layout issues. The app now uses a native Gradio layout with minimal CSS to avoid them.

### 4) Missing Azure key

Without `AZURE_OPENAI_API_KEY`, the assistant cannot answer.
On Spaces, set it in Space Settings > Secrets. Locally, set it in `.env`.

## Maintenance notes

- Generated charts are stored in `charts/`.
- `generate_chart` saves PNG files with a title-derived filename.
- If needed, clean old files in `charts/` periodically.

## Helpful reference

- Spaces config docs: https://huggingface.co/docs/hub/spaces-config-reference