metadata
title: Pandas CSV Analyzer (LlamaIndex)
emoji: 🐼
colorFrom: purple
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: false
license: mit
short_description: An AI agent for CSV analysis with LlamaIndex & Pandas.
sdk_version: 5.44.1

๐Ÿผ Pandas CSV Analyzer (LlamaIndex)

This AI agent lets you analyze CSV files by asking questions in natural language. Upload a file, ask your questions, and get answers without writing any code.

✨ Key Features

  • Natural Language Queries: Ask questions like, "Which branch had the highest total revenue?" or "Show the top 5 best-selling products."
  • Pandas Code Generation: The AI generates and executes the necessary Python (Pandas) code to answer your query, displaying the code used for full transparency.
  • PDF Report Generation: Download your entire conversation history and analysis as a PDF report.
  • Robust Architecture: Built with the LlamaIndex Workflows library, which organizes the data processing pipeline into modular steps.
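
To make the example questions above concrete, here is a minimal sketch of the kind of Pandas expressions that could answer them. The data and the column names (`branch`, `product`, `revenue`, `units_sold`) are hypothetical. The code the agent actually generates depends on your CSV and on the model's output.

```python
import pandas as pd

# Hypothetical sales data; the column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "branch": ["North", "South", "North", "East", "South", "East"],
        "product": ["A", "B", "C", "A", "D", "E"],
        "revenue": [120.0, 80.0, 200.0, 150.0, 60.0, 90.0],
        "units_sold": [12, 8, 10, 15, 6, 30],
    }
)

# "Which branch had the highest total revenue?"
# Sum revenue per branch, then take the label of the largest total.
top_branch = df.groupby("branch")["revenue"].sum().idxmax()

# "Show the top 5 best-selling products."
# Sum units per product, then keep the five largest totals.
top_products = df.groupby("product")["units_sold"].sum().nlargest(5)

print(top_branch)
print(top_products)
```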

🚀 How to Use

  1. Upload your CSV File: Use the upload panel on the left to load your data.
  2. Ask a Question: Type your question about the data in the text box at the bottom.
  3. Get Insights: The agent will process your request, generate the answer, and display it in the chat.
  4. Download the Report: When your analysis is complete, click "Generate and Download PDF" to get a full report.

๐Ÿ› ๏ธ How it Works (Tech Stack)

This project combines the following technologies:

  • Interface: Gradio is used to build the interactive web interface.
  • Data Manipulation: Pandas is the engine behind all data manipulation and analysis of the CSV file.
  • AI Orchestration: LlamaIndex Workflows manages the entire process, from receiving the user's question to synthesizing the final answer. This is a modern replacement for the older QueryPipelines.
  • Language Model (LLM): The Groq API provides access to high-speed language models (such as Llama 3) to generate Pandas code and synthesize human-readable answers.
  • PDF Generation: The fpdf2 library is used to create reports from the conversation history.

The workflow is as follows:

  1. The user submits a question.
  2. The LlamaIndex PandasWorkflow is initiated.
  3. The LLM receives the question, data schema, and examples, then generates a Pandas expression.
  4. This expression is safely executed in the backend.
  5. The result is sent back to the LLM, which generates a final, natural-language response for the user.
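
The steps above can be sketched in plain Python. This is an illustrative outline, not the project's actual code. It does not use the LlamaIndex or Groq APIs, and `describe_schema`, `fake_llm`, `run_expression`, and `answer` are hypothetical names. The LLM call is replaced by a stub that returns a fixed expression, and the evaluation step only removes Python builtins, which narrows what an expression can reach but is not a real sandbox.

```python
import pandas as pd


def describe_schema(df: pd.DataFrame) -> str:
    """Summarize column names and dtypes for inclusion in the prompt."""
    return ", ".join(f"{col} ({dtype})" for col, dtype in df.dtypes.items())


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a fixed expression."""
    return 'df.groupby("branch")["revenue"].sum().idxmax()'


def run_expression(expr: str, df: pd.DataFrame):
    """Evaluate a single expression with builtins removed.

    Only `df` and `pd` are exposed. This is an assumption about what
    restricted execution could look like, not a secure sandbox.
    """
    return eval(expr, {"__builtins__": {}}, {"df": df, "pd": pd})


def answer(question: str, df: pd.DataFrame) -> str:
    # Step 3: build a prompt from the schema and question, get an expression.
    prompt = f"Schema: {describe_schema(df)}\nQuestion: {question}"
    expr = fake_llm(prompt)
    # Step 4: execute the generated expression against the DataFrame.
    result = run_expression(expr, df)
    # Step 5 would send the result back to the LLM; here it is only formatted.
    return f"Code: {expr}\nResult: {result}"


# Hypothetical data for a quick run.
df = pd.DataFrame(
    {"branch": ["North", "South", "North"], "revenue": [120.0, 80.0, 200.0]}
)
print(answer("Which branch had the highest total revenue?", df))
```

Returning the generated expression alongside the result mirrors the transparency feature described earlier, where the code used is shown with the answer.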

Developed by Yuri Arduino Bernardineli Alves