metadata
title: Pandas CSV Analyzer (LlamaIndex)
emoji: 🐼
colorFrom: purple
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: false
license: mit
short_description: An AI agent for CSV analysis with LlamaIndex & Pandas.
sdk_version: 5.44.1
🐼 Pandas CSV Analyzer (LlamaIndex)
This is an advanced AI agent that allows you to analyze CSV files by asking questions in natural language. Upload a file, ask your questions, and get instant insights without writing a single line of code.
✨ Key Features
- Natural Language Queries: Ask questions like, "Which branch had the highest total revenue?" or "Show the top 5 best-selling products."
- Pandas Code Generation: The AI generates and executes the necessary Python (Pandas) code to answer your query, displaying the code used for full transparency.
- PDF Report Generation: Download your entire conversation history and analysis in a clean and organized PDF report.
- Robust Architecture: Built with the new LlamaIndex Workflows library, ensuring a modular and reliable data processing pipeline.
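To make the first two features concrete, here is a rough sketch of the kind of Pandas expressions the agent might generate for the example questions above. The DataFrame and its column names (branch, product, units, revenue) are invented for illustration; the code the agent actually produces depends on your CSV and on the model.

```python
import pandas as pd

# Hypothetical sales data; the column names are assumptions for illustration only.
df = pd.DataFrame(
    {
        "branch": ["North", "South", "North", "East", "South"],
        "product": ["Widget", "Gadget", "Gadget", "Widget", "Widget"],
        "units": [10, 4, 7, 3, 12],
        "revenue": [100.0, 80.0, 140.0, 30.0, 120.0],
    }
)

# "Which branch had the highest total revenue?"
# Sum revenue per branch, then take the label of the largest total.
top_branch = df.groupby("branch")["revenue"].sum().idxmax()

# "Show the top 5 best-selling products."
# Sum units per product and keep the five largest totals (fewer if there are fewer products).
top_products = df.groupby("product")["units"].sum().nlargest(5)

print(top_branch)
print(top_products)
```

Both answers are single expressions over df, which is the shape of output that is easiest to display alongside the answer.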
How to Use
- Upload your CSV File: Use the upload panel on the left to load your data.
- Ask a Question: Type your question about the data in the text box at the bottom.
- Get Insights: The agent will process your request, generate the answer, and display it in the chat.
- Download the Report: When your analysis is complete, click "Generate and Download PDF" to get a full report.
🛠️ How it Works (Tech Stack)
This project integrates several cutting-edge technologies to create a seamless data analysis experience:
- Interface: Gradio is used to build the interactive web interface.
- Data Manipulation: Pandas is the engine behind all data manipulation and analysis of the CSV file.
- AI Orchestration: LlamaIndex Workflows manages the entire process, from receiving the user's question to synthesizing the final answer. It is a modern replacement for the older QueryPipelines.
- Language Model (LLM): The Groq API provides access to high-speed language models (like Llama 3) to generate Pandas code and synthesize human-readable answers.
- PDF Generation: The FPDF2 library is used to create reports from the conversation history.
The workflow is as follows:
- The user submits a question.
- The LlamaIndex PandasWorkflow is initiated.
- The LLM receives the question, data schema, and examples, then generates a Pandas expression.
- This expression is safely executed in the backend.
- The result is sent back to the LLM, which generates a final, natural-language response for the user.
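The steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual PandasWorkflow: the two fake_llm_* functions are hypothetical stand-ins for the Groq calls, describe_schema and run_expression are names made up here, and the restricted eval is only one assumed way to approach "safe" execution.

```python
import pandas as pd


def describe_schema(df: pd.DataFrame) -> str:
    """Summarize column names and dtypes for the prompt (hypothetical helper)."""
    return ", ".join(f"{col} ({dtype})" for col, dtype in df.dtypes.items())


def fake_llm_generate_expression(question: str, schema: str) -> str:
    """Stand-in for the LLM call; a real model would use the question and schema."""
    return "df.groupby('branch')['revenue'].sum().idxmax()"


def run_expression(expression: str, df: pd.DataFrame):
    """Evaluate a single expression with no builtins; only df and pd are visible.

    This narrows what generated code can reach but is not a full sandbox.
    """
    return eval(expression, {"__builtins__": {}}, {"df": df, "pd": pd})


def fake_llm_synthesize(question: str, expression: str, result) -> str:
    """Stand-in for the second LLM call that phrases the final answer."""
    return f"{question} Result: {result} (computed with {expression})"


def answer_question(question: str, df: pd.DataFrame) -> str:
    schema = describe_schema(df)                                  # data schema
    expression = fake_llm_generate_expression(question, schema)   # generate code
    result = run_expression(expression, df)                       # execute it
    return fake_llm_synthesize(question, expression, result)      # final response


# Hypothetical data for the usage example.
sales = pd.DataFrame(
    {"branch": ["North", "South", "North"], "revenue": [100.0, 80.0, 140.0]}
)
reply = answer_question("Which branch had the highest total revenue?", sales)
print(reply)
```

Keeping the generated expression in the reply mirrors the transparency feature: the user sees both the answer and the code that produced it.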
Developed by Yuri Arduino Bernardineli Alves
- GitHub: YuriArduino
- Email: yuriarduino@gmail.com