---

title: Bertopic Gradio
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
---


# BERTopic Topic Modeling Gradio App

A user-friendly web application for topic modeling using BERTopic with Hugging Face embeddings. Upload a text file and visualize discovered topics with an interactive intertopic distance map.

## Features

- **File Upload**: Upload any text file (.txt) for analysis
- **Automatic Document Detection**: Splits the uploaded text into documents by paragraph or by line
- **Hugging Face Embeddings**: Uses sentence-transformers for high-quality embeddings
- **Interactive Visualization**: Explore topics with a Plotly-based intertopic distance map
- **Topic Explorer**: Get detailed information about specific topics
- **Customizable Parameters**: Fine-tune UMAP and HDBSCAN settings
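
The features above map onto a fairly standard BERTopic setup. The following is a minimal sketch of how such a model could be assembled, not the app's actual code: the `TopicModelParams` dataclass, the `build_topic_model` helper, the `all-MiniLM-L6-v2` model name, and every default value are assumptions chosen for illustration. The heavy libraries are imported inside the function so the parameter object can be used without them.

```python
from dataclasses import dataclass


@dataclass
class TopicModelParams:
    """Hypothetical container for the tunable settings (all defaults assumed)."""
    embedding_model: str = "all-MiniLM-L6-v2"  # assumed sentence-transformers model
    n_neighbors: int = 15       # UMAP: size of the local neighborhood
    n_components: int = 5       # UMAP: dimensionality fed to the clusterer
    min_cluster_size: int = 5   # HDBSCAN: smallest group that counts as a topic


def build_topic_model(params: TopicModelParams):
    """Assemble a BERTopic model from explicit embedding, UMAP, and HDBSCAN parts."""
    # Imported lazily because these packages are slow to load.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    embedder = SentenceTransformer(params.embedding_model)
    umap_model = UMAP(
        n_neighbors=params.n_neighbors,
        n_components=params.n_components,
        min_dist=0.0,
        metric="cosine",
        random_state=42,  # fixed seed so repeated runs give the same layout
    )
    hdbscan_model = HDBSCAN(
        min_cluster_size=params.min_cluster_size,
        metric="euclidean",
        cluster_selection_method="eom",
        prediction_data=True,
    )
    return BERTopic(
        embedding_model=embedder,
        umap_model=umap_model,
        hdbscan_model=hdbscan_model,
    )
```

With the dependencies installed, `build_topic_model(TopicModelParams()).fit_transform(docs)` fits the model on a list of strings, after which `visualize_topics()` returns the Plotly intertopic distance map and `get_topic_info()` returns the topic table.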

## Usage

### 1. Prepare Your Data

Create a text file where:
- Each **line** or **paragraph** is treated as a separate document
- Documents should have at least 3 words each
- For best results, provide 20 to 50 or more documents with varied content
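
To check how a file is likely to be split before uploading it, the rules above can be approximated locally. This sketch assumes a simple heuristic (use paragraphs when the text contains blank lines, otherwise use lines) and a hypothetical `split_documents` helper; the app's real detection logic may differ.

```python
import re

MIN_WORDS = 3  # documents shorter than this are dropped


def split_documents(text: str, min_words: int = MIN_WORDS) -> list[str]:
    """Split raw text into documents by paragraph, falling back to lines."""
    text = text.replace("\r\n", "\n").strip()

    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]

    # Assumed heuristic: only trust paragraph breaks if there is more than one.
    chunks = paragraphs if len(paragraphs) > 1 else text.splitlines()

    # Collapse internal whitespace, then filter out very short documents.
    docs = [" ".join(chunk.split()) for chunk in chunks]
    return [doc for doc in docs if len(doc.split()) >= min_words]
```

Calling `len(split_documents(open("corpus.txt", encoding="utf-8").read()))` on your own file (the file name here is only a placeholder) gives a quick count to compare against the recommended document range.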

### 2. Upload and Process

1. Click "Upload Text File" and select your .txt file
2. Adjust advanced parameters if needed (or use defaults)
3. Click "🚀 Run Topic Modeling"

### 3. Explore Results

- **Intertopic Distance Map**: Interactive visualization showing topic clusters
- **Topic Table**: Shows topic IDs, document counts, and top keywords
- **Topic Explorer**: Enter a topic ID to see detailed keyword weights and representative documents
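
The Topic Explorer output can be reproduced from a fitted BERTopic model with two of its methods, `get_topic` and `get_representative_docs`. The sketch below is an assumption about how such a lookup might be formatted; `describe_topic` is a hypothetical helper, not a function from the app.

```python
def describe_topic(topic_model, topic_id: int, top_n: int = 10) -> str:
    """Format keyword weights and representative documents for one topic."""
    words = topic_model.get_topic(topic_id)
    if not words:  # BERTopic returns False for an unknown topic ID
        return f"Topic {topic_id} not found."

    lines = [f"Topic {topic_id}", "Keywords:"]
    for word, weight in words[:top_n]:
        lines.append(f"  {word}: {weight:.4f}")

    # Representative documents may be missing for some topics.
    docs = topic_model.get_representative_docs(topic_id) or []
    if docs:
        lines.append("Representative documents:")
        lines.extend(f"  - {doc}" for doc in docs)
    return "\n".join(lines)
```

In BERTopic, topic ID `-1` is reserved for outlier documents that were not assigned to any cluster, so it is usually skipped when browsing topics.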