---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# BERTopic-gemini-summary

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("nataliecastro/BERTopic-gemini-summary")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 37
* Number of training documents: 3500


| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | the - and - to - of - for | 13 | -1_the_and_to_of |
| 0 | the - to - of - that - budget | 1043 | 0_the_to_of_that |
| 1 | the - to - and - of - that | 625 | 1_the_to_and_of |
| 2 | thompson - the - school - and - of | 432 | 2_thompson_the_school_and |
| 3 | westminster - swanson - the - of - dr | 125 | 3_westminster_swanson_the_of |
| 4 | students - and - college - to - the | 118 | 4_students_and_college_to |
| 5 | and - the - you - to - of | 114 | 5_and_the_you_to |
| 6 | math - the - data - in - performance | 95 | 6_math_the_data_in |
| 7 | anderson - the - boulder - and - of | 80 | 7_anderson_the_boulder_and |
| 8 | board - the - that - policy - to | 72 | 8_board_the_that_policy |
| 9 | the - and - program - to - of | 59 | 9_the_and_program_to |
| 10 | bilingual - language - the - and - of | 53 | 10_bilingual_language_the_and |
| 11 | the - and - salary - to - of | 49 | 11_the_and_salary_to |
| 12 | the - board - of - and - greeley | 45 | 12_the_board_of_and |
| 13 | sros - sro - of - to - and | 42 | 13_sros_sro_of_to |
| 14 | technology - policy - use - to - privacy | 40 | 14_technology_policy_use_to |
| 15 | and - schools - the - principal - school | 39 | 15_and_schools_the_principal |
| 16 | that - we - the - and - to | 37 | 16_that_we_the_and |
| 17 | audit - financial - the - statements - of | 33 | 17_audit_financial_the_statements |
| 18 | athletic - coach - the - and - team | 33 | 18_athletic_coach_the_and |
| 19 | security - safety - and - of - the | 31 | 19_security_safety_and_of |
| 20 | technology - devices - to - the - of | 31 | 20_technology_devices_to_the |
| 21 | transgender - lgbtq - and - of - gender | 29 | 21_transgender_lgbtq_and_of |
| 22 | sustainability - energy - waste - air - and | 26 | 22_sustainability_energy_waste_air |
| 23 | sel - and - students - the - mental | 25 | 23_sel_and_students_the |
| 24 | dyslexia - screener - reading - the - with | 24 | 24_dyslexia_screener_reading_the |
| 25 | bullying - and - to - the - of | 20 | 25_bullying_and_to_the |
| 26 | greeley - the - scholarship - school - of | 20 | 26_greeley_the_scholarship_school |
| 27 | calendar - the - days - break - for | 20 | 27_calendar_the_days_break |
| 28 | start - school - time - day - to | 18 | 28_start_school_time_day |
| 29 | color - of - diversity - and - the | 17 | 29_color_of_diversity_and |
| 30 | and - speaker - the - to - that | 17 | 30_and_speaker_the_to |
| 31 | volunteers - the - and - of - their | 16 | 31_volunteers_the_and_of |
| 32 | enrollment - growth - in - housing - the | 16 | 32_enrollment_growth_in_housing |
| 33 | ix - title - complaint - harassment - sexual | 15 | 33_ix_title_complaint_harassment |
| 34 | marijuana - policy - medical - to - the | 14 | 34_marijuana_policy_medical_to |
| 35 | pilch - dr - the - board - to | 14 | 35_pilch_dr_the_board |

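
Many of the topic keywords in the table are common function words ("the", "and", "to"), which can make the rows hard to compare at a glance. The following is a minimal sketch, not part of the model itself, that takes a few rows copied from the table, drops such words, and expresses each frequency as a share of the 3500 training documents. The stop-word list and the helper names `content_words` and `share` are assumptions made for illustration only.

```python
# A few rows copied from the topic table: (topic ID, keywords, frequency).
ROWS = [
    (-1, "the - and - to - of - for", 13),
    (0, "the - to - of - that - budget", 1043),
    (2, "thompson - the - school - and - of", 432),
    (21, "transgender - lgbtq - and - of - gender", 29),
]
TOTAL_DOCS = 3500  # "Number of training documents" from the topic overview

# Assumption: a tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"the", "and", "to", "of", "for", "that", "in", "you", "we"}


def content_words(keywords: str) -> list[str]:
    """Split an 'a - b - c' keyword string and drop stop words."""
    return [word for word in keywords.split(" - ") if word not in STOP_WORDS]


def share(frequency: int, total: int = TOTAL_DOCS) -> float:
    """Fraction of the training documents assigned to a topic."""
    return frequency / total


for topic_id, keywords, frequency in ROWS:
    print(topic_id, content_words(keywords), f"{share(frequency):.1%}")
```

This only post-processes the strings shown in the table; it does not change the fitted model. Topic ID -1 is the label BERTopic uses for outlier documents, so its keywords are not expected to be meaningful.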
## Training hyperparameters

* calculate_probabilities: True
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

## Framework versions

* Numpy: 1.24.3
* HDBSCAN: 0.8.29
* UMAP: 0.5.6
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 3.1.0
* Transformers: 4.44.2
* Numba: 0.57.0
* Plotly: 5.9.0
* Python: 3.10.12
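
If loading the model behaves unexpectedly, one thing worth checking is whether the local environment matches the framework versions listed above. The sketch below uses only the standard library to compare installed package versions against that list. The dictionary keys are assumed to be the PyPI distribution names of the listed libraries, and `version_mismatches` is a hypothetical helper, not part of BERTopic. The Python version itself is not checked here.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list.
# Assumption: the keys are the PyPI distribution names of these libraries.
EXPECTED = {
    "numpy": "1.24.3",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.6",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "3.1.0",
    "transformers": "4.44.2",
    "numba": "0.57.0",
    "plotly": "5.9.0",
}


def version_mismatches(expected):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the report is only a starting point when debugging.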