---
title: EDA-Generator
emoji: 🚀
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
license: apache-2.0
---

# 🚀 EDA-Generator

AI-Powered Chatbot & Dataset Exploratory Analysis on HuggingFace Spaces.

## Features

### 💬 AI Chatbot
- Powered by **GPT-OSS-20B** (OpenAI's open-weight model)
- 21B parameters with Mixture-of-Experts architecture
- Streaming responses for real-time interaction
- Built with HuggingFace Transformers & Chat Templates
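A streaming chat handler typically yields the growing reply after each chunk arrives, so the UI can redraw the message in place. The following is a minimal sketch of that pattern only; `fake_token_stream` and `stream_reply` are hypothetical names, and the stand-in generator replaces the real model call, so the actual `chatbot.py` may look quite different.

```python
from typing import Iterable, Iterator


def fake_token_stream(reply: str) -> Iterator[str]:
    """Hypothetical stand-in for a model's token stream: yields word-sized chunks."""
    words = reply.split(" ")
    for i, word in enumerate(words):
        # Re-attach the separating space to every chunk except the first.
        yield word if i == 0 else " " + word


def stream_reply(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the accumulated text after each chunk arrives."""
    partial = ""
    for chunk in chunks:
        partial += chunk
        yield partial


if __name__ == "__main__":
    for text in stream_reply(fake_token_stream("Hello from the chatbot")):
        print(text)
```

Yielding the full partial string, rather than only the newest chunk, keeps the handler stateless from the UI's point of view: each yielded value is the complete message so far.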

### 📊 Dataset EDA
- Analyzes the **fka/awesome-chatgpt-prompts** dataset
- 1,040 curated prompts for various AI personas
- Interactive Plotly visualizations:
  - Prompt length distribution
  - Word count analysis
  - Top words frequency
  - Prompt length vs. word count correlation
  - Category distribution
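The statistics behind the first three plots can be sketched with the standard library alone. This is an illustrative example, not the code in `eda.py`: the function name `prompt_stats`, the simple regex tokenizer, and the two sample prompts are all assumptions made for the sketch.

```python
from collections import Counter
import re


def prompt_stats(prompts: list[str], top_n: int = 3) -> dict:
    """Compute character lengths, word counts, and the most frequent words."""
    lengths = [len(p) for p in prompts]
    # Lowercase and keep alphabetic runs (with apostrophes) as words.
    tokenized = [re.findall(r"[a-z']+", p.lower()) for p in prompts]
    word_counts = [len(tokens) for tokens in tokenized]
    top_words = Counter(w for tokens in tokenized for w in tokens).most_common(top_n)
    return {"lengths": lengths, "word_counts": word_counts, "top_words": top_words}


if __name__ == "__main__":
    sample = [  # made-up prompts, not taken from the dataset
        "Act as a Linux terminal",
        "Act as a travel guide",
    ]
    print(prompt_stats(sample))
```

The resulting lists could then be handed to a plotting library for the histogram, bar chart, and scatter plot views.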

## Local Setup

```bash
# Clone the repository and enter it
git clone https://huggingface.co/spaces/<your-username>/EDA-Generator
cd EDA-Generator

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set your HuggingFace token
export HF_TOKEN="your_token_here"
# Or create a .env file with: HF_TOKEN=your_token_here

# Run the app
python app.py
```
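On the Python side, the token set above can be read from the environment, with the `.env` file as a fallback. The sketch below assumes a plain `KEY=VALUE` file format and uses a hypothetical helper name, `load_hf_token`; the app itself may load the token differently (for example through a dotenv library).

```python
import os
from pathlib import Path
from typing import Optional


def load_hf_token(env_file: str = ".env") -> Optional[str]:
    """Return HF_TOKEN from the environment, else from a simple KEY=VALUE file."""
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text().splitlines():
            line = line.strip()
            if line.startswith("HF_TOKEN="):
                # Drop the key and any surrounding quotes around the value.
                return line.split("=", 1)[1].strip().strip('"').strip("'")
    return None


if __name__ == "__main__":
    print("HF_TOKEN found" if load_hf_token() else "HF_TOKEN is not set")
```

Checking the environment first means an exported variable takes precedence over the file, which is convenient when the same code runs locally and on a hosted Space.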

## Tech Stack

- **Frontend**: Gradio 4.x
- **ML**: HuggingFace Transformers
- **Model**: openai/gpt-oss-20b
- **Visualization**: Plotly, Matplotlib, Seaborn
- **Data**: HuggingFace Datasets

## Files

| File | Description |
|------|-------------|
| `app.py` | Main Gradio application |
| `chatbot.py` | Chatbot logic with transformers |
| `eda.py` | EDA functions & visualizations |
| `requirements.txt` | Python dependencies |

## License

Apache 2.0