metadata
title: AI Data Science Assistant
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.2
app_file: app.py
pinned: false

🤖 AI Data Science Assistant

A powerful AI assistant for data analysis, document Q&A, and general conversation - completely free and open-source!


🌟 Features

πŸ“Š CSV Data Analysis

  • Upload & Analyze: Drop your CSV files for instant analysis
  • Smart Visualizations: Auto-generates 4 types of charts:
    • Data types distribution (pie chart)
    • Missing values analysis (bar chart)
    • Numeric distributions (histograms)
    • Correlation matrix (heatmap)
  • Statistical Summary: Complete descriptive statistics for numeric columns
  • Data Preview: Shows dataset shape, columns, and first 5 rows
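
The summary items above map fairly directly onto pandas calls. The following is a minimal sketch of how such a report could be assembled, not the code in `app.py`; the function name `summarize_csv` and the sample data are made up for illustration.

```python
import io

import pandas as pd


def summarize_csv(source):
    """Collect basic facts about a CSV file path or file-like object."""
    df = pd.read_csv(source)
    numeric = df.select_dtypes(include="number")
    return {
        "shape": df.shape,                     # (rows, columns)
        "columns": list(df.columns),
        "dtype_counts": df.dtypes.astype(str).value_counts().to_dict(),
        "missing": df.isna().sum().to_dict(),  # missing values per column
        "describe": numeric.describe(),        # descriptive statistics
        "correlation": numeric.corr(),         # Pearson correlation matrix
        "preview": df.head(5),                 # first 5 rows
    }


if __name__ == "__main__":
    # Hypothetical sample data with one missing income value.
    sample = io.StringIO("age,city,income\n31,Oslo,52000\n45,Lima,\n28,Pune,41000\n")
    report = summarize_csv(sample)
    print(report["shape"])
    print(report["missing"])
```

Each dictionary entry corresponds to one of the charts or tables listed above: `dtype_counts` would feed the pie chart, `missing` the bar chart, and `correlation` the heatmap.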

📚 PDF Document Q&A (RAG)

  • Upload PDFs: Process any PDF document for question answering
  • Smart Retrieval: Uses RAG (Retrieval Augmented Generation) with FAISS vector database
  • Context-Aware: Answers questions based on actual document content
  • Source Attribution: References the source document in responses
  • Session Memory: Keeps uploaded documents available for the rest of your session
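
To make the retrieval step concrete, here is a toy sketch of the chunk, embed, rank, and prompt flow using only the standard library. It is an illustration under stated assumptions: word counts stand in for the sentence-transformers embeddings, a sorted list stands in for the FAISS index, and all function names and chunk sizes are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, context_chunks):
    """Place the retrieved chunks ahead of the question for the language model."""
    context = "\n".join(context_chunks)
    return f"Answer using only the context.\n\nContext:\n{context}\n\nQuestion: {question}"
```

In a real pipeline the `embed` and `retrieve` functions would be replaced by the embedding model and vector store named in the Technical Stack section; the surrounding flow stays the same.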

💬 General AI Chat

  • Claude-Style Responses: Clear, helpful, and honest communication
  • Reduced Hallucination: Prompted to say "I don't know" when uncertain
  • Multi-Turn Conversations: Maintains conversation context
  • Wide Knowledge: Helps with coding, explanations, creative tasks, and more
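
One common way to maintain conversation context with a small model is to replay the last few turns in each prompt. The sketch below shows that idea; the class name `ChatHistory`, the turn limit, and the instruction wording are assumptions for illustration, not taken from the app.

```python
from collections import deque


class ChatHistory:
    """Keeps only the most recent turns so the prompt stays short."""

    def __init__(self, max_turns=4):
        # Each item is a (user_message, assistant_reply) pair; old turns fall off.
        self.turns = deque(maxlen=max_turns)

    def add(self, user_message, assistant_reply):
        self.turns.append((user_message, assistant_reply))

    def build_prompt(self, new_message):
        """Replay stored turns, then append the new message for the model to answer."""
        lines = ["Answer helpfully. If you are not sure, say \"I don't know\"."]
        for user_message, assistant_reply in self.turns:
            lines.append(f"User: {user_message}")
            lines.append(f"Assistant: {assistant_reply}")
        lines.append(f"User: {new_message}")
        lines.append("Assistant:")
        return "\n".join(lines)
```

Capping the number of turns is a trade-off: a smaller limit keeps prompts within the model's input length, at the cost of forgetting earlier parts of the conversation.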

🚀 Try It Now

Live Demo: https://huggingface.co/spaces/mrradix/ai-ds-assistant

🎯 Use Cases

For Data Scientists & Analysts

  • Quick CSV exploration and visualization
  • Statistical analysis without writing code
  • Data quality assessment (missing values, distributions)
  • Correlation analysis between variables

For Researchers & Students

  • PDF document analysis and Q&A
  • Extract insights from research papers
  • Ask questions about uploaded documents
  • Get explanations of complex concepts

For General Users

  • AI-powered conversations
  • Help with various tasks and questions
  • Document analysis and summarization
  • Data visualization assistance

🛠️ Technical Stack

Models (100% Open Source)

  • Language Model: google/flan-t5-base - Google's instruction-tuned T5
  • Embeddings: sentence-transformers/all-MiniLM-L6-v2 - For document retrieval
  • Vector Store: FAISS - Efficient similarity search
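
For readers who want to try the language model on its own, the following sketch loads `google/flan-t5-base` with the Transformers auto classes. It assumes `transformers` and `torch` are installed and that the model can be downloaded on first use; the helper names and token limits are illustrative, and `app.py` may load the model differently.

```python
from functools import lru_cache

MODEL_ID = "google/flan-t5-base"  # model named in the list above


@lru_cache(maxsize=1)
def load_model():
    """Load the tokenizer and model once and reuse them on later calls."""
    # Imported lazily so that importing this module stays cheap.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    return tokenizer, model


def generate_answer(prompt, max_new_tokens=128):
    """Run one prompt through the model and return the decoded text."""
    tokenizer, model = load_model()
    # The input length cap here is an illustrative choice.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_answer("Translate to German: Good morning")` would download the weights the first time and reuse the cached model afterwards.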

Libraries & Frameworks

  • Frontend: Gradio - Interactive web interface
  • Data Processing: Pandas, NumPy - Data manipulation
  • Visualization: Matplotlib - Chart generation
  • AI/ML: Transformers, PyTorch - Model inference
  • RAG Pipeline: LangChain - Document processing and Q&A
  • PDF Processing: PDFMiner - Text extraction

📖 How to Use

1. CSV Analysis

  1. Go to the "📊 CSV Analysis" tab
  2. Upload your CSV file using the file uploader
  3. Click "📈 Analyze CSV"
  4. View the automatic analysis, statistics, and charts

2. PDF Q&A

  1. Switch to the "📚 PDF Q&A" tab
  2. Upload a PDF document
  3. Click "📤 Process PDF" and wait for confirmation
  4. Ask questions about the document content
  5. Get AI-powered answers based on the document

3. General Chat

  1. Visit the "💬 General Chat" tab
  2. Type your question or message
  3. Get helpful, Claude-style responses
  4. Continue the conversation naturally

🎨 Screenshots

CSV Analysis Interface

The CSV tab provides instant data insights with professional visualizations and statistical summaries.

PDF Q&A Interface

Upload any PDF and ask questions - the AI will answer based on the actual document content.

General Chat Interface

Natural conversation interface for any topic or question.

🔧 Local Installation

Want to run this locally? Here's how:

# Clone the repository
git clone https://huggingface.co/spaces/mrradix/ai-ds-assistant
cd ai-ds-assistant

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py

Requirements

  • Python 3.8+
  • 4GB+ RAM recommended
  • GPU optional (will use CPU if not available)
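
The CPU fallback mentioned above is typically a one-line check in PyTorch. This is a small sketch of that check, assuming PyTorch is the inference backend; the function name `pick_device` is hypothetical.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is no GPU path to use
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running on: {pick_device()}")
```

The returned string can be passed to `model.to(...)` so the same code runs on machines with or without a GPU.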

🌟 Key Advantages

βœ… Completely Free

  • No API keys required
  • No usage limits
  • Open-source models only

✅ Privacy-First

  • All processing happens inside the Space, or on your own machine when run locally
  • No data sent to external APIs
  • Your documents stay private

✅ Production-Ready

  • Robust error handling
  • Professional UI/UX
  • Mobile-responsive design

✅ Educational

  • Learn about RAG systems
  • Understand AI model deployment
  • Explore data science workflows

🤝 Contributing

This is an open-source project! Contributions are welcome:

  1. Report Issues: Found a bug? Open an issue
  2. Feature Requests: Have an idea? Let's discuss it
  3. Pull Requests: Improvements and fixes are appreciated
  4. Documentation: Help improve the docs

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Google: For the Flan-T5 model
  • Hugging Face: For the amazing model hub and Spaces platform
  • LangChain: For the RAG framework
  • Gradio: For the intuitive interface framework
  • Open Source Community: For all the incredible libraries used

📧 Contact & Support


⭐ If you find this helpful, please give it a star and share with others!

Built with ❤️ using open-source AI models