parliament_topic_model

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

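from bertopic import BERTopic

# Load the trained model from the Hugging Face Hub
topic_model = BERTopic.load("daniel-023/parliament_topic_model")

# Summarize every topic (ID, document count, name) as a pandas DataFrame
topic_model.get_topic_info()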
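
Once loaded, the model can also be queried for a single topic's keywords or used to assign topics to new text. The sketch below wraps two BERTopic methods, `get_topic` and `transform`. The helper names `topic_keywords`, `assign_topics`, and `demo`, as well as the sample sentence, are illustrative assumptions and not part of this model card. Assigning topics to new text also assumes the embedding model saved alongside the topic model can be loaded in your environment.

```python
def topic_keywords(topic_model, topic_id, n_words=5):
    """Return the first `n_words` keywords of one topic, in the model's own order."""
    # get_topic returns a list of (word, weight) pairs, or False for an unknown ID
    pairs = topic_model.get_topic(topic_id)
    if not pairs:
        raise KeyError(f"unknown topic id: {topic_id}")
    return [word for word, _weight in pairs[:n_words]]


def assign_topics(topic_model, documents):
    """Map each new document to the ID of its predicted topic."""
    # transform returns (topic IDs, probabilities); only the IDs are used here
    topic_ids, _probabilities = topic_model.transform(documents)
    return {doc: int(topic_id) for doc, topic_id in zip(documents, topic_ids)}


def demo():
    """Hypothetical end-to-end run; needs bertopic installed and network access."""
    from bertopic import BERTopic

    topic_model = BERTopic.load("daniel-023/parliament_topic_model")
    print(topic_keywords(topic_model, 0))
    print(assign_topics(topic_model, ["The Minister announced new bus routes."]))
```

Calling `demo()` downloads the model and prints the keywords of topic 0 followed by the predicted topic for the sample sentence. A prediction of -1 means the document was treated as an outlier.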

Topic overview

  • Number of topics: 20 (19 topics plus the outlier topic -1)
  • Number of training documents: 2005
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | minister - singapore - member - time - government | 16 | -1_minister_singapore_member_time |
| 0 | education - teachers - schools - school - minister | 541 | 0_education_teachers_schools_school |
| 1 | water - reclamation - land - development - minister | 210 | 1_water_reclamation_land_development |
| 2 | million - singapore - government - finance - year | 202 | 2_million_singapore_government_finance |
| 3 | service - police - national - minister - officers | 187 | 3_service_police_national_minister |
| 4 | law - council - house - members - committee | 140 | 4_law_council_house_members |
| 5 | singapore - identity - citizenship - minister - cards | 112 | 5_singapore_identity_citizenship_minister |
| 6 | bus - buses - taxis - transport - taxi | 88 | 6_bus_buses_taxis_transport |
| 7 | property - land - tax - board - flats | 81 | 7_property_land_tax_board |
| 8 | farmers - prices - minister - price - production | 79 | 8_farmers_prices_minister_price |
| 9 | singapore - people - countries - government - foreign | 70 | 9_singapore_people_countries_government |
| 10 | culture - cultural - programmes - films - people | 49 | 10_culture_cultural_programmes_films |
| 11 | abortion - abortions - family - medical - women | 48 | 11_abortion_abortions_family_medical |
| 12 | fund - pension - citizenship - age - years | 38 | 12_fund_pension_citizenship_age |
| 13 | airport - telephone - passengers - singapore - terminal | 37 | 13_airport_telephone_passengers_singapore |
| 14 | sports - games - national - singapore - national sports | 29 | 14_sports_games_national_singapore |
| 15 | drug - drugs - medicines - advertisements - medical | 24 | 15_drug_drugs_medicines_advertisements |
| 16 | health - mosquitoes - mosquito - hawkers - rubbish | 20 | 16_health_mosquitoes_mosquito_hawkers |
| 17 | brigade - sports - minister - station - firefighting | 17 | 17_brigade_sports_minister_station |
| 18 | hawkers - market - hawker - stalls - markets | 17 | 18_hawkers_market_hawker_stalls |
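
As a quick consistency check, the topic frequencies listed above can be summed and compared with the stated number of training documents, and each topic's share of the corpus can be derived from the same counts. The sketch below uses only the counts from the table; the names `TOPIC_FREQUENCIES` and `topic_shares` are illustrative.

```python
# Topic frequencies copied from the table above (topic ID -> document count)
TOPIC_FREQUENCIES = {
    -1: 16, 0: 541, 1: 210, 2: 202, 3: 187, 4: 140, 5: 112, 6: 88, 7: 81,
    8: 79, 9: 70, 10: 49, 11: 48, 12: 38, 13: 37, 14: 29, 15: 24, 16: 20,
    17: 17, 18: 17,
}


def topic_shares(frequencies):
    """Return each topic's fraction of all documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


total_documents = sum(TOPIC_FREQUENCIES.values())
shares = topic_shares(TOPIC_FREQUENCIES)

print(total_documents)        # 2005, matching the number of training documents
print(f"{shares[0]:.1%}")     # 27.0% of documents fall under topic 0
print(f"{shares[-1]:.1%}")    # 0.8% of documents are outliers (topic -1)
```

The counts add up to 2005, so every training document is accounted for, and fewer than 1% of documents were left in the outlier topic.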

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 20
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
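
The hyperparameters above correspond to keyword arguments of the `BERTopic` constructor, so a comparable untrained model could be set up as in the sketch below. The names `TRAINING_HYPERPARAMETERS` and `build_untrained_model` are illustrative. This card does not state the embedding model, the UMAP and HDBSCAN settings, or the training documents, so fitting such a model should not be expected to reproduce the published topics exactly.

```python
# Values copied from the "Training hyperparameters" list
TRAINING_HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(hyperparameters=TRAINING_HYPERPARAMETERS):
    """Create an unfitted BERTopic instance with the listed settings."""
    # Imported lazily so the settings can be inspected without bertopic installed
    from bertopic import BERTopic

    return BERTopic(**hyperparameters)
```

With your own list of strings `docs`, `build_untrained_model().fit_transform(docs)` would train a new model under these settings.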

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.37
  • UMAP: 0.5.5
  • Pandas: 2.2.0
  • Scikit-Learn: 1.4.1.post1
  • Sentence-transformers: 2.4.0
  • Transformers: 4.43.3
  • Numba: 0.60.0
  • Plotly: 5.23.0
  • Python: 3.12.1
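
To compare a local environment against the versions listed above, the standard library's `importlib.metadata` can report what is installed. This is a minimal sketch: the keys of `EXPECTED_VERSIONS` are the assumed PyPI distribution names for each library (for example `umap-learn` for UMAP), and `version_mismatches` is an illustrative helper. The Python version itself is not checked here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name
EXPECTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.37",
    "umap-learn": "0.5.5",
    "pandas": "2.2.0",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "2.4.0",
    "transformers": "4.43.3",
    "numba": "0.60.0",
    "plotly": "5.23.0",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (expected, installed)} for each package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in version_mismatches().items():
    print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily prevent the model from loading; the listed versions only record the environment in which it was trained.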