---
title: Data Analyst Agent
emoji: 📊
colorFrom: yellow
colorTo: blue
sdk: gradio
app_file: app.py
pinned: false
license: mit
---

# Data Analyst Agent

## Question

What does an agentic data-analysis loop look like when the generated code is visible?

## System Boundary

This Space analyzes CSV files by generating pandas code, executing it in a constrained namespace, and returning both the result and the code. The transparency is deliberate.

## Method

The app reads the uploaded CSV, summarizes the dataframe schema, sends the user question and schema to an instruction model, extracts executable pandas code, runs it with safeguards, and displays tables or Plotly charts.
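The two bookend steps of that pipeline, summarizing the schema for the prompt and pulling code out of the model reply, can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the helper names `summarize_schema` and `extract_code`, the summary layout, and the assumption that the model answers with a fenced code block are all hypothetical.

```python
import io
import re

import pandas as pd

# Built at runtime so this example does not embed a literal Markdown fence.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def summarize_schema(df: pd.DataFrame, sample_rows: int = 3) -> str:
    """Describe shape, column dtypes, and a few sample rows as prompt text."""
    lines = [f"rows: {len(df)}, columns: {len(df.columns)}"]
    for name, dtype in df.dtypes.items():
        lines.append(f"- {name}: {dtype}")
    lines.append("sample:")
    lines.append(df.head(sample_rows).to_csv(index=False).strip())
    return "\n".join(lines)


def extract_code(reply: str) -> str:
    """Return the first fenced block in the reply, or the whole reply if none."""
    match = CODE_BLOCK.search(reply)
    return (match.group(1) if match else reply).strip()


if __name__ == "__main__":
    df = pd.read_csv(io.StringIO("region,sales\nnorth,10\nsouth,20\n"))
    print(summarize_schema(df))
    reply = "Sure:\n" + FENCE + "python\nresult = df['sales'].sum()\n" + FENCE
    print(extract_code(reply))
```

Sending only the schema and a few sample rows, rather than the full file, keeps the prompt small and keeps the actual computation on the Python side.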

## Technique

This is a tool-using agent pattern. The language model does not directly compute the answer; it writes code that a deterministic tool executes.

The useful boundary is between the model and the runtime. The model proposes a program. Python computes the result. The user can inspect both.
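One way to picture that boundary is a small runner that executes the proposed program in a namespace exposing only `df`, `pd`, and a short allowlist of builtins, while capturing printed output as the execution log. This is a sketch under assumptions: `run_generated_code`, the `result` variable convention, and the allowlist are hypothetical and may differ from what the Space does.

```python
import builtins
import contextlib
import io

import pandas as pd

# Assumed allowlist; the Space's real list is not documented in this README.
ALLOWED_BUILTINS = ("abs", "len", "max", "min", "print", "range", "round", "sorted", "sum")
SAFE_BUILTINS = {name: getattr(builtins, name) for name in ALLOWED_BUILTINS}


def run_generated_code(code: str, df: pd.DataFrame):
    """Execute model-written code and return (result, log).

    The code is expected to assign its answer to a variable named `result`
    (an assumed convention). On failure, result is None and the log ends
    with the exception type and message.
    """
    # Work on a copy so generated code cannot mutate the caller's dataframe.
    namespace = {"__builtins__": SAFE_BUILTINS, "pd": pd, "df": df.copy()}
    log = io.StringIO()
    try:
        with contextlib.redirect_stdout(log):
            exec(code, namespace)
    except Exception as exc:
        return None, log.getvalue() + f"{type(exc).__name__}: {exc}"
    return namespace.get("result"), log.getvalue()


if __name__ == "__main__":
    frame = pd.DataFrame({"region": ["north", "south"], "sales": [10, 20]})
    print(run_generated_code("print('summing')\nresult = df['sales'].sum()", frame))
    # Fails because __import__ is absent from the restricted builtins.
    print(run_generated_code("import os", frame))
```

Restricting `__builtins__` this way stops casual misuse such as `import os` or `open(...)`, but it is not a security boundary: Python objects still expose attributes that determined code can use to escape.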

## Output

The app returns generated code, execution logs, result tables, and visualizations.

## Why It Matters

Agent demos are often opaque. This one exposes its reasoning artifact, the generated code, for inspection. That lets users verify calculations, learn from the workflow, and debug failures.

## What To Notice

Read the generated pandas code before trusting the answer. If the code is wrong, the result is wrong. That is the right failure mode, because the error sits in code you can read rather than inside an opaque answer.

## Effect In Practice

Transparent code generation can speed up exploratory analysis while preserving auditability. It is especially useful for teaching, notebooks, and internal analytics tools.

## Hugging Face Extension

The Space can be evaluated with a dataset of CSV files, natural-language questions, expected code patterns, and expected answers.
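A minimal sketch of such an evaluation, assuming each case bundles the four fields named above and that the agent is exposed as a callable returning `(generated_code, answer)`. The `EvalCase` record, the `evaluate` function, and the use of regular expressions for "expected code patterns" are illustrative choices, not an existing harness.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical agent interface: (csv_text, question) -> (generated_code, answer).
Agent = Callable[[str, str], Tuple[str, Any]]


@dataclass
class EvalCase:
    csv_text: str
    question: str
    expected_patterns: List[str]  # regexes the generated code should match
    expected_answer: Any


def score_case(case: EvalCase, agent: Agent) -> Dict[str, bool]:
    """Check one case for both code shape and final answer."""
    code, answer = agent(case.csv_text, case.question)
    patterns_ok = all(re.search(p, code) for p in case.expected_patterns)
    return {"patterns_ok": patterns_ok, "answer_ok": answer == case.expected_answer}


def evaluate(cases: List[EvalCase], agent: Agent) -> Dict[str, float]:
    """Return the fraction of cases passing each check."""
    if not cases:
        return {"pattern_rate": 0.0, "answer_rate": 0.0}
    scores = [score_case(case, agent) for case in cases]
    total = len(scores)
    return {
        "pattern_rate": sum(s["patterns_ok"] for s in scores) / total,
        "answer_rate": sum(s["answer_ok"] for s in scores) / total,
    }
```

Scoring code patterns and answers separately distinguishes a correct answer reached through unexpected code from a plausible program that computes the wrong value.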

## Limitations

Generated code should be reviewed. The execution sandbox is intentionally narrow and does not replace a hardened production isolation layer.
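As an aid to that review, a static pre-check can flag obviously risky constructs before anything runs. The sketch below uses Python's standard `ast` module; the `find_violations` helper and the `BLOCKED_CALLS` set are hypothetical, the README does not say whether the Space performs a check like this, and it is no more a production isolation layer than the sandbox is.

```python
import ast
from typing import List

# Assumed denylist of builtin calls; chosen for illustration only.
BLOCKED_CALLS = {"eval", "exec", "open", "compile", "__import__", "getattr", "setattr"}


def find_violations(code: str) -> List[str]:
    """List constructs a reviewer would likely want to reject outright."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(f"line {node.lineno}: import statement")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"line {node.lineno}: dunder attribute {node.attr}")
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            problems.append(f"line {node.lineno}: call to {node.func.id}")
    return problems


if __name__ == "__main__":
    print(find_violations("result = df.groupby('region')['sales'].sum()"))
    print(find_violations("import os\nopen('data.csv')"))
```

An empty list means only that none of these specific patterns appeared; the code still needs a human read.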

## Run Locally

```bash
pip install -r requirements.txt
python app.py
```