bertopic_sim85_20topics_raw

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("DobreMihai/bertopic_sim85_20topics_raw")

topic_model.get_topic_info()
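
Once the model is loaded, new texts can be assigned to topics with `transform`. The sketch below is illustrative only: the helper `assign_topics`, the `demo` function, and the sample review text are assumptions and not part of this card. Inference also assumes that the embedding model used in training is available with the saved model, which this card does not state.

```python
def assign_topics(topic_model, docs):
    """Pair each document with the topic id predicted by the model.

    `topic_model` is expected to be a fitted BERTopic instance; its
    `transform` method returns (topics, probabilities).
    """
    topics, _probabilities = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


def demo():
    # Imported lazily so the helper above can be used without bertopic installed.
    from bertopic import BERTopic

    model = BERTopic.load("DobreMihai/bertopic_sim85_20topics_raw")
    # Hypothetical input text, chosen only for illustration.
    sample_docs = ["The alarm did not ring this morning"]
    for doc, topic_id in assign_topics(model, sample_docs):
        # get_topic returns the (word, score) pairs for a topic id.
        print(topic_id, model.get_topic(topic_id), doc)
```

Calling `demo()` downloads the model from the Hub, so it needs network access. The ids returned by `transform` are the model's own topic ids, where -1 conventionally marks outliers.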

Topic overview

  • Number of topics: 29
  • Number of training documents: 56774
Topic ID | Topic Keywords | Topic Frequency | Label
0 | loud - very - really - not | 27 | loud
1 | snooze | 3 | snooze
2 | the - math | 1 | math
3 | premium | 1 | premium*
4 | be - the - it - to - and | 15906 | -1_be_the_it_to
5 | app - the - be - alarm - to | 20603 | 0_app_the_be_alarm
6 | good - nice - excellent - very - awesome | 11694 | 1_good_nice_excellent_very
7 | work - easy - helpful - useful - very | 3149 | 2_work_easy_helpful_useful
8 | ok - well - op - gud - one | 1460 | 3_ok_well_op_gud
9 | hai - hi - nahi - nhi - se | 1031 | 4_hai_hi_nahi_nhi
10 | ring - not - do - it - sometimes | 649 | 5_ring_not_do_it
11 | subscription | 379 | 6_subscription___
12 | aap - ap - good - nice - very | 373 | 7_aap_ap_good_nice
13 | star - late - give - because - time | 213 | 8_star_late_give_because
14 | pay - dependable - expensive - worth - money | 197 | 9_pay_dependable_expensive_worth
15 | que - la - de - para - es | 183 | 10_que_la_de_para
16 | life - change - save - saver - my | 171 | 11_life_change_save_saver
17 | experience - fail - never - fun - challenge | 164 | 12_experience_fail_never_fun
18 | osm - awsm - ossm - ossum - owsm | 147 | 13_osm_awsm_ossm_ossum
19 | annoying - suck - hate - it - complaint | 105 | 14_annoying_suck_hate_it
20 | vry - good - vgood - vv - app | 59 | 15_vry_good_vgood_vv
21 | nic - mst - nicr - nc - nicw | 49 | 16_nic_mst_nicr_nc
22 | s3 - galaxy - samsung - 71 - a71android | 47 | 17_s3_galaxy_samsung_71
23 | camera | 45 | 18_camera___
24 | subscription | 43 | 19_subscription___
25 | full - what - need - exactly - just | 36 | 20_full_what_need_exactly
26 | alam - alamy - alamr - good - spiritual | 15 | 21_alam_alamy_alamr_good
27 | supper - chef - kiss - alaram - app | 13 | 22_supper_chef_kiss_alaram
28 | go - just - for - it - thought | 11 | 23_go_just_for_it
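
The per-topic counts above can be checked against the reported number of training documents and turned into shares. The following is a small sketch of that arithmetic; the function names are illustrative, and the counts are assumed to be keyed by the Topic ID column of the overview.

```python
# Document counts per topic ID, copied from the topic overview on this card.
TOPIC_COUNTS = {
    0: 27, 1: 3, 2: 1, 3: 1, 4: 15906, 5: 20603, 6: 11694, 7: 3149,
    8: 1460, 9: 1031, 10: 649, 11: 379, 12: 373, 13: 213, 14: 197,
    15: 183, 16: 171, 17: 164, 18: 147, 19: 105, 20: 59, 21: 49,
    22: 47, 23: 45, 24: 43, 25: 36, 26: 15, 27: 13, 28: 11,
}
TRAINING_DOCS = 56774  # "Number of training documents" from the card


def topic_shares(counts):
    """Return each topic's fraction of all counted documents."""
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


def top_topics(counts, k=3):
    """Return the k (topic_id, count) pairs with the largest counts."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:k]


# The counts should account for every training document.
assert sum(TOPIC_COUNTS.values()) == TRAINING_DOCS

shares = topic_shares(TOPIC_COUNTS)
for topic_id, count in top_topics(TOPIC_COUNTS):
    print(f"topic {topic_id}: {count} docs ({shares[topic_id]:.1%})")
```

By these numbers the three largest groups (IDs 5, 4, and 6) together hold about 85% of the documents, and ID 4, whose label starts with -1, is the outlier group.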

Training hyperparameters

  • calculate_probabilities: False
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
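
These settings correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch, they could be passed back in as shown below; `CARD_PARAMS` and `build_model` are illustrative names. The card does not record the embedding, UMAP, or HDBSCAN sub-models, so this alone would not be expected to reproduce the published model.

```python
# Hyperparameters exactly as listed on this card.
CARD_PARAMS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model(params=CARD_PARAMS):
    """Create an unfitted BERTopic instance from the listed settings.

    Assumption: sub-models (embedding model, UMAP, HDBSCAN, vectorizer)
    fall back to BERTopic defaults because the card does not list them.
    """
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**params)
```

A model built this way still has to be fitted with `fit` or `fit_transform` on a corpus before it yields topics.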

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.38.post1
  • UMAP: 0.5.6
  • Pandas: 2.2.1
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.1.0
  • Transformers: 4.44.2
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.15
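
To see how a local environment compares with the versions listed above, a check along the following lines may help. It is only a sketch: the distribution names (for example `umap-learn` for UMAP) are assumed PyPI names, and `compare_versions` is an illustrative helper.

```python
from importlib import metadata

# Versions from this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.38.post1",
    "umap-learn": "0.5.6",
    "pandas": "2.2.1",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.1.0",
    "transformers": "4.44.2",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def compare_versions(expected=CARD_VERSIONS):
    """Map each distribution to (card_version, installed_version_or_None)."""
    report = {}
    for dist, wanted in expected.items():
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed locally
        report[dist] = (wanted, installed)
    return report


for dist, (wanted, installed) in compare_versions().items():
    status = "ok" if installed == wanted else "differs"
    print(f"{dist}: card={wanted} installed={installed} [{status}]")
```

A mismatch does not necessarily prevent loading the model; the report is just a starting point when results differ from the card.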