---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# topic_model_bert_topic

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic

# Load the model from the Hugging Face Hub
topic_model = BERTopic.load("VegetaSama/topic_model_bert_topic")

# Returns a DataFrame with one row per topic
topic_model.get_topic_info()
```

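Once loaded, the model can also assign topics to text it has not seen. The following is a minimal sketch, not a tested recipe for this particular checkpoint: it assumes the saved model can resolve its embedding model at load time, which this card does not confirm. `assign_topics` and `demo` are hypothetical helper names; `transform` is BERTopic's inference method, which returns the predicted topic IDs together with probabilities.

```python
def assign_topics(topic_model, docs):
    """Pair each document with the topic ID predicted by a fitted BERTopic model.

    `topic_model` is any object exposing BERTopic's `transform(docs)`,
    which returns (topics, probabilities). Topic -1 marks outliers.
    """
    topics, _probs = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


def demo():
    """Illustrative only: needs `bertopic` installed and access to the Hub."""
    from bertopic import BERTopic  # imported lazily so the helper above stays dependency-free

    topic_model = BERTopic.load("VegetaSama/topic_model_bert_topic")
    docs = ["The crust on this pizza was perfect."]  # made-up example review
    for doc, topic in assign_topics(topic_model, docs):
        print(topic, doc)
```

The helper takes the model as an argument so it can be exercised with a stub object, without downloading anything.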
## Topic overview

* Number of topics: 26
* Number of training documents: 10000

<details>
  <summary>Click here for an overview of all topics.</summary>

  | Topic ID | Topic Keywords | Topic Frequency | Label |
  |----------|----------------|-----------------|-------|
  | -1 | place - good - food - great - like | 82 | -1_place_good_food_great |
  | 0 | store - like - car - just - great | 3133 | 0_store_like_car_just |
  | 1 | mexican - tacos - food - salsa - good | 876 | 1_mexican_tacos_food_salsa |
  | 2 | good - ordered - just - food - cheese | 725 | 2_good_ordered_just_food |
  | 3 | pizza - good - crust - place - great | 568 | 3_pizza_good_crust_place |
  | 4 | food - great - place - good - service | 536 | 4_food_great_place_good |
  | 5 | hotel - room - pool - stay - airport | 445 | 5_hotel_room_pool_stay |
  | 6 | burger - fries - burgers - good - like | 370 | 6_burger_fries_burgers_good |
  | 7 | hair - dr - massage - nails - time | 339 | 7_hair_dr_massage_nails |
  | 8 | coffee - starbucks - place - good - like | 269 | 8_coffee_starbucks_place_good |
  | 9 | scottsdale - food - place - great - good | 262 | 9_scottsdale_food_place_great |
  | 10 | sushi - roll - rolls - place - good | 258 | 10_sushi_roll_rolls_place |
  | 11 | minutes - food - table - just - time | 241 | 11_minutes_food_table_just |
  | 12 | ice - ice cream - cream - cupcakes - cupcake | 224 | 12_ice_ice cream_cream_cupcakes |
  | 13 | breakfast - pancakes - eggs - good - place | 200 | 13_breakfast_pancakes_eggs_good |
  | 14 | thai - curry - pad - food - pad thai | 199 | 14_thai_curry_pad_food |
  | 15 | bbq - phoenix - food - brisket - good | 199 | 15_bbq_phoenix_food_brisket |
  | 16 | beer - place - great - beers - food | 179 | 16_beer_place_great_beers |
  | 17 | service - food - good - place - order | 152 | 17_service_food_good_place |
  | 18 | pho - vietnamese - broth - spring - spring rolls | 147 | 18_pho_vietnamese_broth_spring |
  | 19 | bar - music - place - night - cool | 106 | 19_bar_music_place_night |
  | 20 | chinese - chinese food - food - soup - chicken | 101 | 20_chinese_chinese food_food_soup |
  | 21 | greek - vegan - meat - food - place | 100 | 21_greek_vegan_meat_food |
  | 22 | theater - movie - seats - movies - amc | 99 | 22_theater_movie_seats_movies |
  | 23 | chinese - food - chinese food - pei - pei wei | 95 | 23_chinese_food_chinese food_pei |
  | 24 | bar - good - food - happy - great | 95 | 24_bar_good_food_happy |

</details>

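The frequencies in the topic table sum to the 10,000 training documents, so each topic's share of the corpus follows directly from its count. The sketch below hardcodes a few rows copied from the table. `make_label` and `topic_share` are illustrative names, and the label rule (topic ID plus the first four keywords, joined by underscores) is inferred from the Label column rather than taken from BERTopic's documentation.

```python
# A few rows copied from the topic table: topic ID -> (keywords, frequency).
TOPICS = {
    -1: (["place", "good", "food", "great", "like"], 82),
    0: (["store", "like", "car", "just", "great"], 3133),
    1: (["mexican", "tacos", "food", "salsa", "good"], 876),
    12: (["ice", "ice cream", "cream", "cupcakes", "cupcake"], 224),
}
N_DOCS = 10000  # number of training documents stated in this card


def make_label(topic_id, keywords):
    """Rebuild the Label column: ID followed by the first four keywords."""
    return "_".join([str(topic_id), *keywords[:4]])


def topic_share(frequency, n_docs=N_DOCS):
    """Fraction of the training documents assigned to a topic."""
    return frequency / n_docs


for topic_id, (keywords, freq) in TOPICS.items():
    print(f"{make_label(topic_id, keywords):35s} {topic_share(freq):.2%}")
```

On these numbers, topic 0 alone covers roughly 31% of the documents, while the outlier topic (-1) holds under 1%.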
## Training hyperparameters

* calculate_probabilities: True
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 5
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

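The listed hyperparameters map onto keyword arguments of the `BERTopic` constructor. The following sketch shows how a comparable model could be set up. It assumes a BERTopic release that accepts the zero-shot arguments. It will not reproduce the topics above exactly, because the embedding model, the UMAP and HDBSCAN settings, the vectorizer, and the training data are not recorded in this card. `BERTOPIC_KWARGS` and `build_model` are hypothetical names. `language` is left out because the card records it as `None`; omitting it so the library default applies is an assumption.

```python
# Values copied from the hyperparameter list in this card.
# `language` (recorded as None) is intentionally omitted; see the note above.
BERTOPIC_KWARGS = {
    "calculate_probabilities": True,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 5,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model(**overrides):
    """Create an unfitted BERTopic model with the listed settings.

    Keyword overrides replace individual values, e.g. build_model(verbose=False).
    """
    from bertopic import BERTopic  # imported lazily; requires `pip install bertopic`

    return BERTopic(**{**BERTOPIC_KWARGS, **overrides})
```

Calling `build_model().fit_transform(docs)` on a list of strings would then train a new model; the resulting topics depend on the data and on the components left at their defaults.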
## Framework versions

* Numpy: 1.24.3
* HDBSCAN: 0.8.33
* UMAP: 0.5.5
* Pandas: 2.0.3
* Scikit-Learn: 1.3.0
* Sentence-transformers: 2.2.2
* Transformers: 4.32.1
* Numba: 0.58.1
* Plotly: 5.9.0
* Python: 3.11.5
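Loading a saved model can be sensitive to library versions, so it may help to compare the local environment against the versions listed here. This is a small sketch using only the standard library; `EXPECTED` and `version_mismatches` are illustrative names, and the keys are the PyPI distribution names that correspond to the libraries in the list (for example `umap-learn` for UMAP).

```python
from importlib import metadata

# Versions copied from the list in this card, keyed by PyPI distribution name.
EXPECTED = {
    "numpy": "1.24.3",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "2.0.3",
    "scikit-learn": "1.3.0",
    "sentence-transformers": "2.2.2",
    "transformers": "4.32.1",
    "numba": "0.58.1",
    "plotly": "5.9.0",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (expected, installed_or_None)} for each package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one used for training.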