---
title: Data Analyst MCP
emoji: 🚀
colorFrom: green
colorTo: purple
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
short_description: Analyze survey data and generate stats and charts from SQL
---

# Data Analyst agent with MCP

A data analysis assistant focused on the `surveyData` table, with automatic chart generation.

## What this app does

- Answers natural-language questions about survey data.
- Runs data queries through MCP tools.
- Generates charts (`bar`, `pie`, `line`) when useful.
- Displays results in a Gradio UI that works well on Hugging Face Spaces.

## Tech stack

- Python 3.11+
- [Gradio 6.12.0](https://www.gradio.app/)
- `smolagents` + MCP client (SSE transport)
- Azure OpenAI (configurable model)
- `matplotlib` + `seaborn` for charts

## Project structure

```text
.
├── app.py              # Gradio UI + agent orchestration
├── tools.py            # Local tool: generate_chart(...)
├── prompt.py           # Agent system instructions
├── charts/             # Generated PNG files
├── requirements.txt
└── README.md
```

## Prerequisites

- An MCP endpoint reachable over SSE.
- An Azure OpenAI deployment.
- A `.env` file (currently loaded from the parent directory of this project).

## Environment variables

Used by `app.py`:

- `MCP_SERVER_URL` (example: `https://.../gradio_api/mcp/sse`)
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_API_VERSION` (default: `2024-12-01-preview`)
- `AZURE_OPENAI_MODEL` (default: `gpt-4o`)
- `PORT` (default: `7860`)

Minimal `.env` example:

```env
MCP_SERVER_URL=https://your-mcp-server/gradio_api/mcp/sse
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=YOUR_KEY
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_MODEL=gpt-4o
```

## Run locally

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
```

Then open `http://localhost:7860`.
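
If the app fails at startup, a quick pre-flight check of the environment variables listed in this README can help narrow things down. The following is a minimal sketch, not the code in `app.py`: `load_settings` is a hypothetical helper, and treating `MCP_SERVER_URL`, `AZURE_OPENAI_ENDPOINT`, and `AZURE_OPENAI_API_KEY` as required is an assumption based on the fact that no default is documented for them.

```python
import os

# Defaults as documented in this README.
DEFAULTS = {
    "AZURE_OPENAI_API_VERSION": "2024-12-01-preview",
    "AZURE_OPENAI_MODEL": "gpt-4o",
    "PORT": "7860",
}

# Assumption: variables without a documented default are treated as required.
REQUIRED = ("MCP_SERVER_URL", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")


def load_settings(env=None):
    """Hypothetical helper: return a settings dict, or raise if a required value is missing."""
    env = os.environ if env is None else env

    # Collect every missing name so the error reports all of them at once.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # Fall back to the documented default when the variable is unset or empty.
        settings[name] = env.get(name) or default
    return settings
```

Calling `load_settings()` in a Python shell, after the `.env` values have been exported, either returns the resolved settings or names the variables that still need to be set.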
## Deploy to Hugging Face Spaces

This repository is already configured as a Gradio Space through the front-matter at the top of this README.

Simple workflow:

1. Make your code changes.
2. Commit on `main`.
3. Push to the Space remote:

   ```bash
   git push origin main
   ```

The Space rebuilds automatically.

## Runtime flow (high level)

1. User asks a question.
2. `ToolCallingAgent` calls MCP tools (and `generate_chart` when relevant).
3. If a chart is generated, it is encoded in base64 and rendered inline.
4. The assistant answer is shown in chat.

## Prompting / important data rule

In `prompt.py`, one strict rule is enforced:

- the SQL table is always **`"surveyData"`** (capital `D`)

If your SQL backend changes, update this part first.

## Useful example questions

- `How many questions do we have?`
- `Sub-options per question`
- `Distribution by question type`
- `Top 5 questions by number of sub options`
- `Percentage distribution of questions`

## Troubleshooting

### 1) `SSE Stream ended` / `httpx.ConnectError`

Common cause: MCP endpoint is unreachable (wrong URL, sleeping service, network issue).

Check:

- `MCP_SERVER_URL` is correct
- endpoint is reachable from the deployment environment
- MCP server is actually running

### 2) Gradio startup errors

Examples already encountered:

- unsupported `type` argument on `gr.Chatbot`
- unsupported `bubble_full_width` argument

If this happens again, confirm the Gradio version in `requirements.txt` and align component parameters accordingly.

### 3) UI page growing infinitely on Spaces

The app now uses a native Gradio layout + minimal CSS to avoid iframe-related layout issues.

### 4) Missing Azure key

Without `AZURE_OPENAI_API_KEY`, the assistant cannot answer. Set it in Space Settings > Secrets.

## Maintenance notes

- Generated charts are stored in `charts/`.
- `generate_chart` saves PNG files with a title-derived filename.
- If needed, clean old files in `charts/` periodically.
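
The periodic cleanup of `charts/` mentioned above could be scripted. The following is a minimal sketch under stated assumptions: `prune_charts` is a hypothetical helper that does not exist in this repository, and the 7-day retention period is an arbitrary illustrative value.

```python
import time
from pathlib import Path


def prune_charts(charts_dir="charts", max_age_days=7, now=None):
    """Hypothetical helper: delete PNG files older than max_age_days.

    Returns the names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day

    directory = Path(charts_dir)
    removed = []
    if not directory.is_dir():
        # Nothing to clean if the directory has not been created yet.
        return removed

    for png in sorted(directory.glob("*.png")):
        # Compare the last-modified time against the cutoff; other file types are ignored.
        if png.is_file() and png.stat().st_mtime < cutoff:
            png.unlink()
            removed.append(png.name)
    return removed
```

Running `prune_charts()` from the project root removes only `.png` files directly inside `charts/`; anything else in that directory is left untouched.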

## Helpful reference

- Spaces config docs: https://huggingface.co/docs/hub/spaces-config-reference