| |
|
| | --- |
| | tags: |
| | - bertopic |
| | library_name: bertopic |
| | pipeline_tag: text-classification |
| | --- |
| | |
| | # BERTopic_v1_july |
| |
|
| | This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model. |
| | BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets. |
| |
|
| | ## Usage |
| |
|
| | To use this model, please install BERTopic: |
| |
|
| | ```bash |
| | pip install -U bertopic |
| | ``` |
| |
|
| | You can use the model as follows: |
| |
|
| | ```python |
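| | from bertopic import BERTopic |
| | |
| | # Download the serialized model from the Hugging Face Hub and load it. |
| | topic_model = BERTopic.load("shantanudave/BERTopic_v1_july") |
| | |
| | # Returns a pandas DataFrame with one row per topic (ID, size, name, keywords). |
| | topic_model.get_topic_info() |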
| | ``` |
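A loaded model can also assign topics to unseen text through `transform`, which returns a `(topics, probabilities)` pair. The following is a minimal sketch, not part of the original card. It assumes the checkpoint references a usable embedding model; if it does not, `BERTopic.load` accepts an `embedding_model` argument. The helper names `assign_topics` and `demo`, and the sample reviews, are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def assign_topics(topic_model: Any, docs: Sequence[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID predicted by a fitted model.

    `topic_model` is expected to behave like a fitted BERTopic instance,
    whose `transform` returns a (topics, probabilities) pair.
    """
    topics, _probabilities = topic_model.transform(list(docs))
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


def demo() -> None:
    """Load the published model and print a topic ID for each sample review.

    Requires `pip install -U bertopic` and network access to the Hub,
    so it is defined here but not called automatically.
    """
    from bertopic import BERTopic

    topic_model = BERTopic.load("shantanudave/BERTopic_v1_july")
    sample_docs = [  # hypothetical reviews, not taken from the training data
        "The payment with my card failed twice.",
        "Delivery was fast and the package arrived intact.",
    ]
    for doc, topic_id in assign_topics(topic_model, sample_docs):
        print(topic_id, doc)
```

Calling `demo()` in an environment with BERTopic installed prints one topic ID per review. The IDs can be matched against the `Topic ID` column in the topic overview.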
| |
|
| | ## Topic overview |
| |
|
| | * Number of topics: 18 |
| | * Number of training documents: 8526 |
| |
|
| | <details> |
| | <summary>Click here for an overview of all topics.</summary> |
| | |
| | | Topic ID | Topic Keywords | Topic Frequency | Label | |
| | |----------|----------------|-----------------|-------| |
| | | 0 | payment - pay - card - bank - money | 742 | Payment Issues Detection | |
| | | 1 | load - slow - search - article - doesnt | 705 | Slow Search Function | |
| | | 2 | clothes - clothing - size - fashion - large size | 683 | Large Size Quality Clothing | |
| | | 3 | bon - - - - | 668 | bon documents collection | |
| | | 4 | clear - intuitive - clear easy - recommend - selection | 665 | Easy Clear Navigation | |
| | | 5 | - - - - | 649 | Keyword-Driven Document Analysis | |
| | | 6 | shopping - staff - friendly - store - satisfy | 578 | Friendly staff satisfaction | |
| | | 7 | delivery - fast delivery - fast - shipping - ship | 563 | Fast Delivery Quality | |
| | | 8 | cart - shop cart - log - password - add | 548 | Shopping Cart Issues | |
| | | 9 | easy use - easy - use - use easy - quick easy | 531 | Quick & Easy Solutions | |
| | | 10 | awesome - excellent - think - clearly - phenomenal | 462 | Really Phenomenal Clear Thinking | |
| | | 11 | quality - price - quality quality - price quality - comfortable | 454 | Excellent Quality Price | |
| | | 12 | work work - work - work quickly - flawlessly - work flawlessly | 390 | Efficient Flawless Work | |
| | | 13 | super super - super - superb - superb super - super friendly | 349 | Superb Friendly Coat | |
| | | 14 | really simple - ra - solve problem - control - satisfied easy | 145 | User-Friendly Problem Solver | |
| | | 15 | clear clear - clear - fast clear - clear fast - super clear | 144 | Clear and Transparent Working | |
| | | 16 | discover - stuff good - stuff - fact - clearly | 129 | Discovering Interesting Facts | |
| | | 17 | satisfied - satisfaction - totally satisfied - satisfied good - completely satisfied | 121 | Utmost Satisfaction | |
| | |
| | </details> |
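The frequencies in the table can be checked against the document count and turned into per-topic shares. The sketch below copies the 18 frequencies from the table and uses only the standard library; the names `TOPIC_FREQUENCIES` and `topic_shares` are illustrative and not part of BERTopic.

```python
# Topic frequencies copied from the overview table, indexed by topic ID (0..17).
TOPIC_FREQUENCIES = [
    742, 705, 683, 668, 665, 649, 578, 563, 548,
    531, 462, 454, 390, 349, 145, 144, 129, 121,
]
TRAINING_DOCUMENTS = 8526  # "Number of training documents" in the overview


def topic_shares(frequencies):
    """Return each topic's share of all counted documents, as a fraction."""
    total = sum(frequencies)
    return [count / total for count in frequencies]


shares = topic_shares(TOPIC_FREQUENCIES)

# The frequencies add up to the training-document count, so the table
# accounts for every training document.
assert sum(TOPIC_FREQUENCIES) == TRAINING_DOCUMENTS

for topic_id, share in enumerate(shares):
    print(f"Topic {topic_id:>2}: {share:6.1%}")
```

The largest topic (ID 0, payment) covers about 8.7% of the documents, so no single topic dominates the corpus.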
| |
|
| | ## Training hyperparameters |
| |
|
| | * calculate_probabilities: True |
| | * language: None |
| | * low_memory: False |
| | * min_topic_size: 10 |
| | * n_gram_range: (1, 1) |
| | * nr_topics: None |
| | * seed_topic_list: None |
| | * top_n_words: 10 |
| | * verbose: True |
| | * zeroshot_min_similarity: 0.7 |
| | * zeroshot_topic_list: None |
| | |
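The listed values correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how they could be passed to create a fresh, unfitted model. It is an illustration under assumptions, not the author's training script. The card does not state the BERTopic version (the `zeroshot_*` arguments exist only in recent releases), nor the embedding, UMAP, HDBSCAN, or vectorizer settings, so this alone would not reproduce the published model. The function name `build_untrained_model` is hypothetical.

```python
# Values copied from the "Training hyperparameters" list.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model():
    """Create a fresh, unfitted BERTopic instance with the listed settings.

    bertopic is imported lazily so the dictionary above can be inspected
    without the library installed. Sub-model settings (embedding model,
    UMAP, HDBSCAN, vectorizer) are not given in the card and fall back
    to the library defaults here.
    """
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)
```

With `min_topic_size` at 10, the underlying clustering step is allowed to form fairly small topics. The smallest topic in the overview still holds 121 documents.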
| | ## Framework versions |
| | |
| | * Numpy: 1.23.5 |
| | * HDBSCAN: 0.8.33 |
| | * UMAP: 0.5.5 |
| | * Pandas: 1.3.5 |
| | * Scikit-Learn: 1.4.1.post1 |
| | * Sentence-transformers: 2.6.1 |
| | * Transformers: 4.41.2 |
| | * Numba: 0.59.1 |
| | * Plotly: 5.22.0 |
| | * Python: 3.10.13 |
| | |
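Serialized models can behave differently when library versions drift, so it may help to compare the local environment with the versions listed above. The following sketch uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are an assumption, since the card gives display names only, and `find_version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list, keyed by PyPI
# distribution name (an assumption: the card lists display names only).
# The Python interpreter version (3.10.13) is not checked here.
PINNED_VERSIONS = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.3.5",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "2.6.1",
    "transformers": "4.41.2",
    "numba": "0.59.1",
    "plotly": "5.22.0",
}


def find_version_mismatches(pinned=PINNED_VERSIONS):
    """Map each package whose installed version differs to (expected, found).

    Packages that are not installed are reported with found=None.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


for name, (expected, found) in find_version_mismatches().items():
    print(f"{name}: card lists {expected}, environment has {found or 'nothing installed'}")
```

A mismatch does not necessarily prevent loading; the output only shows where to look if loading or inference fails.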