Spaces:

ohmp
/

EDA-Generator

Runtime error

File size: 1,696 Bytes

---
title: EDA-Generator
emoji: 🚀
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
license: apache-2.0
---

# 🚀 EDA-Generator

AI-Powered Chatbot & Dataset Exploratory Analysis on HuggingFace Spaces.

## Features

### 💬 AI Chatbot
- Powered by **GPT-OSS-20B** (OpenAI's open-weight model)
- 21B parameters with Mixture-of-Experts architecture
- Streaming responses for real-time interaction
- Built with HuggingFace Transformers & Chat Templates

### 📊 Dataset EDA
- Analyzes the **fka/awesome-chatgpt-prompts** dataset
- 1,040 curated prompts for various AI personas
- Interactive Plotly visualizations:
  - Prompt length distribution
  - Word count analysis
  - Top words frequency
  - Length vs words correlation
  - Category distribution

## Local Setup

```bash
# Clone the repository
git clone https://huggingface.co/spaces/<your-username>/EDA-Generator

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set your HuggingFace token
export HF_TOKEN="your_token_here"
# Or create a .env file with: HF_TOKEN=your_token_here

# Run the app
python app.py
```

## Tech Stack

- **Frontend**: Gradio 4.x
- **ML**: HuggingFace Transformers
- **Model**: openai/gpt-oss-20b
- **Visualization**: Plotly, Matplotlib, Seaborn
- **Data**: HuggingFace Datasets

## Files

| File | Description |
|------|-------------|
| `app.py` | Main Gradio application |
| `chatbot.py` | Chatbot logic with transformers |
| `eda.py` | EDA functions & visualizations |
| `requirements.txt` | Python dependencies |

## License

Apache 2.0