---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# MARTINI_enrich_BERTopic_RogerHodkinson

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

## Usage

To use this model, first install BERTopic:

```
pip install -U bertopic
```

You can then load the model and inspect its topics:

```python
from bertopic import BERTopic

# Download the model from the Hugging Face Hub and load it
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_RogerHodkinson")

# Return a DataFrame with one row per topic
topic_model.get_topic_info()
```

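Beyond inspecting the topic table, a loaded model can also label new text. The following is a minimal sketch rather than part of the original card: `assign_topics` is a hypothetical helper name, and it only assumes that the object passed in exposes BERTopic's `transform` method, which returns predicted topic IDs together with probabilities.

```python
from typing import Any, List, Tuple


def assign_topics(topic_model: Any, docs: List[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID predicted by a fitted BERTopic model.

    `assign_topics` is a hypothetical helper, not part of BERTopic itself.
    """
    # BERTopic.transform returns (topics, probabilities); only the topics are used here.
    topics, _ = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

With the model loaded as shown in the usage snippet, `assign_topics(topic_model, ["some new text"])` would return a list of `(document, topic_id)` pairs, where an ID of -1 marks a document treated as an outlier. This presumes the embedding model reference was stored with the checkpoint; if it was not, `BERTopic.load` accepts an `embedding_model` argument.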
## Topic overview

* Number of topics: 23 (topics 0 to 21, plus the outlier topic -1)
* Number of training documents: 2203

<details>
  <summary>Click here for an overview of all topics.</summary>

  | Topic ID | Topic Keywords | Topic Frequency | Label |
  |----------|----------------|-----------------|-------|
  | -1 | pfizer - fauci - vaccinated - lockdowns - published | 20 | -1_pfizer_fauci_vaccinated_lockdowns |
  | 0 | fauci - virologists - conspiracy - laboratories - whistleblower | 1174 | 0_fauci_virologists_conspiracy_laboratories |
  | 1 | pandemics - ghebreyesus - trudeau - sovereignty - iran | 98 | 1_pandemics_ghebreyesus_trudeau_sovereignty |
  | 2 | vaccinated - twindemic - bivalent - booster - updated | 93 | 2_vaccinated_twindemic_bivalent_booster |
  | 3 | vaccinations - unvaccinated - dtap - rotavirus - infant | 78 | 3_vaccinations_unvaccinated_dtap_rotavirus |
  | 4 | masks - washington - vaccination - stanford - exemptions | 66 | 4_masks_washington_vaccination_stanford |
  | 5 | myopericarditis - nuvaxovid - physicians - lymphocytic - fatal | 64 | 5_myopericarditis_nuvaxovid_physicians_lymphocytic |
  | 6 | coroners - cv19 - died - worldwide - 2021 | 61 | 6_coroners_cv19_died_worldwide |
  | 7 | newsom - misinformation - physicians - inoculated - astrazeneca | 59 | 7_newsom_misinformation_physicians_inoculated |
  | 8 | infodemic - reclaimthenet - censored - zuckerberg - agencies | 54 | 8_infodemic_reclaimthenet_censored_zuckerberg |
  | 9 | longcovid - lingering - vax - symptoms - sufferers | 50 | 9_longcovid_lingering_vax_symptoms |
  | 10 | lockdown - china - zhengzhou - sars - wechat | 43 | 10_lockdown_china_zhengzhou_sars |
  | 11 | pregnant - miscarriages - pfizer - placental - multiparous | 42 | 11_pregnant_miscarriages_pfizer_placental |
  | 12 | ivermectin - fda - penicillin - cuomo - miracle | 40 | 12_ivermectin_fda_penicillin_cuomo |
  | 13 | plasmidgate - modrna - polio - snapgene - contaminated | 36 | 13_plasmidgate_modrna_polio_snapgene |
  | 14 | pfizer - whistleblower - paxton - quillivant - lawsuit | 33 | 14_pfizer_whistleblower_paxton_quillivant |
  | 15 | fluoxetine - drugmaker - lilly - shortages - mandrola | 33 | 15_fluoxetine_drugmaker_lilly_shortages |
  | 16 | masks - plastic - waste - expose - diapers | 31 | 16_masks_plastic_waste_expose |
  | 17 | military - mandated - discharged - exemptions - whistleblowers | 30 | 17_military_mandated_discharged_exemptions |
  | 18 | oncologists - brca - leukemias - p53 - lymphocytes | 27 | 18_oncologists_brca_leukemias_p53 |
  | 19 | therealanthonyfauci - rfk - joe - shootings - debaters | 27 | 19_therealanthonyfauci_rfk_joe_shootings |
  | 20 | clots - hypercoagulation - vaccinated - pegylated - embalmed | 23 | 20_clots_hypercoagulation_vaccinated_pegylated |
  | 21 | euthanasia - remdesivir - midazolam - ventilator - murdered | 21 | 21_euthanasia_remdesivir_midazolam_ventilator |

</details>

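The frequencies in the overview table can be cross-checked against the totals stated at the top of this section. The sketch below is an illustration added for that purpose: the counts are copied by hand from the table, and the variable names are arbitrary. It confirms the topic and document counts and shows how unevenly the documents are spread across topics.

```python
# Topic frequencies copied from the overview table (topic ID -> document count).
TOPIC_FREQUENCIES = {
    -1: 20, 0: 1174, 1: 98, 2: 93, 3: 78, 4: 66, 5: 64, 6: 61, 7: 59,
    8: 54, 9: 50, 10: 43, 11: 42, 12: 40, 13: 36, 14: 33, 15: 33, 16: 31,
    17: 30, 18: 27, 19: 27, 20: 23, 21: 21,
}

total_documents = sum(TOPIC_FREQUENCIES.values())

# Share of the corpus assigned to each topic, as a fraction between 0 and 1.
topic_shares = {
    topic_id: count / total_documents
    for topic_id, count in TOPIC_FREQUENCIES.items()
}

# Topic ID with the highest document count.
largest_topic = max(TOPIC_FREQUENCIES, key=TOPIC_FREQUENCIES.get)

print(f"topics: {len(TOPIC_FREQUENCIES)}, documents: {total_documents}")
print(f"largest topic: {largest_topic} ({topic_shares[largest_topic]:.1%})")
```

The counts sum to the 2203 training documents reported above, and topic 0 alone holds a little over half of them, so the remaining topics are comparatively small clusters.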
## Training hyperparameters

* calculate_probabilities: True
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

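The settings listed above correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch, assuming a BERTopic release that accepts the `zeroshot_*` arguments (0.16 or later), they could be passed as follows. `build_untrained_model` is a hypothetical helper name. The card does not record the embedding model or the UMAP and HDBSCAN settings, so this would not reproduce the published model exactly.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(embedding_model=None):
    """Create an unfitted BERTopic instance with the settings listed in the card.

    Hypothetical helper: the embedding model is not stated in the card,
    so the caller has to supply one (or accept the library default).
    """
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

Calling `build_untrained_model()` returns a model that still has to be fitted with `fit_transform` on a corpus before it produces any topics.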
## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.40
* UMAP: 0.5.7
* Pandas: 2.2.3
* Scikit-Learn: 1.5.2
* Sentence-transformers: 3.3.1
* Transformers: 4.46.3
* Numba: 0.60.0
* Plotly: 5.24.1
* Python: 3.10.12

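Loading a saved model can be sensitive to library versions, so it may help to compare a local environment against the versions listed above. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are assumptions, since the card lists display names only, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (expected, installed)} for each mismatch or missing package."""
    differences = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            differences[package] = (wanted, installed)
    return differences


for package, (wanted, installed) in compare_versions().items():
    print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the output is only a starting point when debugging environment issues.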