---
title: Bertopic Gradio
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
---
# BERTopic Topic Modeling Gradio App
A user-friendly web application for topic modeling using BERTopic with Hugging Face embeddings. Upload a text file and visualize discovered topics with an interactive intertopic distance map.
## Features
- **File Upload**: Upload any text file (.txt) for analysis
- **Automatic Document Detection**: Splits the text into documents by paragraph or by line
- **Hugging Face Embeddings**: Uses sentence-transformers for high-quality embeddings
- **Interactive Visualization**: Explore topics with a Plotly-based intertopic distance map
- **Topic Explorer**: Get detailed information about specific topics
- **Customizable Parameters**: Fine-tune UMAP and HDBSCAN settings
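
The UMAP and HDBSCAN settings map onto arguments of BERTopic's constructor. The sketch below shows one way to wire them up; it is not taken from this app's `app.py`. The `TopicParams` dataclass, the `build_topic_model` helper, and the `all-MiniLM-L6-v2` embedding model are assumptions for illustration, and the default values shown are BERTopic's library defaults, which may differ from the defaults in this app.

```python
from dataclasses import dataclass


@dataclass
class TopicParams:
    """Hypothetical container for the tunable settings (not from app.py)."""
    embedding_model: str = "all-MiniLM-L6-v2"  # assumed sentence-transformers model
    n_neighbors: int = 15        # UMAP: size of the local neighborhood
    n_components: int = 5        # UMAP: dimensions kept before clustering
    min_cluster_size: int = 10   # HDBSCAN: smallest group that counts as a topic


def build_topic_model(params: TopicParams):
    """Build an unfitted BERTopic model from the given settings."""
    # Imported lazily so TopicParams can be used without the heavy dependencies.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    umap_model = UMAP(
        n_neighbors=params.n_neighbors,
        n_components=params.n_components,
        min_dist=0.0,
        metric="cosine",
    )
    hdbscan_model = HDBSCAN(
        min_cluster_size=params.min_cluster_size,
        metric="euclidean",
        cluster_selection_method="eom",
        prediction_data=True,
    )
    return BERTopic(
        embedding_model=SentenceTransformer(params.embedding_model),
        umap_model=umap_model,
        hdbscan_model=hdbscan_model,
    )
```

A call such as `build_topic_model(TopicParams(min_cluster_size=5)).fit_transform(docs)` returns the topic assignments and their probabilities. For small inputs, a lower `min_cluster_size` will likely be needed, since HDBSCAN cannot form a topic from fewer documents than that value.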
## Usage
### 1. Prepare Your Data
Create a text file where:
- Each **line** or **paragraph** is treated as a separate document
- Documents should have at least 3 words each
- For best results, provide at least 20 documents, ideally 50 or more, with varied content
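
The rules above can be sketched as a small splitting function. This is a minimal illustration, assuming that blank lines mark paragraph boundaries and that the app falls back to one document per line when no blank lines are present; the actual logic in `app.py` may differ. The function name `split_documents` and the `min_words` parameter are hypothetical.

```python
import re


def split_documents(text: str, min_words: int = 3) -> list[str]:
    """Split raw text into documents, by paragraph if possible, else by line."""
    text = text.replace("\r\n", "\n").strip()
    # Paragraphs are runs of text separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    if len(paragraphs) > 1:
        # Collapse line breaks inside a paragraph into single spaces.
        candidates = [" ".join(p.split()) for p in paragraphs]
    else:
        candidates = [line.strip() for line in text.split("\n") if line.strip()]
    # Drop documents with fewer than min_words words.
    return [doc for doc in candidates if len(doc.split()) >= min_words]
```

Running this on your file before uploading is a quick way to check how many documents survive the three-word minimum.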
### 2. Upload and Process
1. Click "Upload Text File" and select your .txt file
2. Adjust advanced parameters if needed (or use defaults)
3. Click "🚀 Run Topic Modeling"
### 3. Explore Results
- **Intertopic Distance Map**: Interactive visualization showing topic clusters
- **Topic Table**: Shows topic IDs, document counts, and top keywords
- **Topic Explorer**: Enter a topic ID to see detailed keyword weights and representative documents
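
These three views correspond to methods on a fitted BERTopic model: `visualize_topics()` returns the Plotly figure for the intertopic distance map, `get_topic_info()` returns the topic table as a DataFrame, and `get_topic()` with `get_representative_docs()` supply the per-topic details. The helper below is a hedged sketch of how a Topic Explorer lookup could be formatted; `describe_topic` and its `max_docs` parameter are hypothetical names, not taken from this app.

```python
def describe_topic(topic_model, topic_id: int, max_docs: int = 3) -> str:
    """Format keyword weights and sample documents for one topic."""
    keywords = topic_model.get_topic(topic_id)  # list of (word, weight) pairs
    if not keywords:
        # BERTopic returns a falsy value for an unknown topic ID.
        return f"Topic {topic_id} not found."
    lines = [f"Topic {topic_id}", "Keywords:"]
    lines += [f"  {word}: {weight:.4f}" for word, weight in keywords]
    docs = topic_model.get_representative_docs(topic_id) or []
    if docs:
        lines.append("Representative documents:")
        lines += [f"  - {doc}" for doc in docs[:max_docs]]
    return "\n".join(lines)
```

In BERTopic, topic ID `-1` collects outlier documents that were not assigned to any cluster, so its keywords are usually less coherent than those of the numbered topics.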