---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# am25_abstract_topic_model

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("djordan/am25_abstract_topic_model")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 171
* Number of training documents: 7863

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | and - the - of - in - to | 5 | -1_and_the_of_in |
| 0 | adc - adcs - dxd - payload - her2 | 3431 | 0_adc_adcs_dxd_payload |
| 1 | kras - ras - g12c - mutant - g12d | 196 | 1_kras_ras_g12c_mutant |
| 2 | ici - patients - immune - response - responders | 179 | 2_ici_patients_immune_response |
| 3 | health - care - women - among - black | 165 | 3_health_care_women_among |
| 4 | aml - leukemia - myeloid - venetoclax - acute | 150 | 4_aml_leukemia_myeloid_venetoclax |
| 5 | gbm - glioblastoma - brain - glioma - tmz | 140 | 5_gbm_glioblastoma_brain_glioma |
| 6 | pd - l1 - anti - ccr8 - antibody | 135 | 6_pd_l1_anti_ccr8 |
| 7 | parpi - parp - usp1 - dna - repair | 100 | 7_parpi_parp_usp1_dna |
| 8 | car - cells - cd19 - antigen - cell | 93 | 8_car_cells_cd19_antigen |
| 9 | egfr - osimertinib - resistance - tkis - tki | 92 | 9_egfr_osimertinib_resistance_tkis |
| 10 | tnbc - breast - triple - mda - negative | 86 | 10_tnbc_breast_triple_mda |
| 11 | pdos - organoids - organoid - drug - 3d | 84 | 11_pdos_organoids_organoid_drug |
| 12 | pdac - pancreatic - basal - ductal - classical | 77 | 12_pdac_pancreatic_basal_ductal |
| 13 | cfdna - methylation - samples - detection - urine | 62 | 13_cfdna_methylation_samples_detection |
| 14 | ar - enzalutamide - prostate - androgen - resistant | 60 | 14_ar_enzalutamide_prostate_androgen |
| 15 | dose - pts - mg - safety - pk | 58 | 15_dose_pts_mg_safety |
| 16 | glucose - glutamine - mitochondrial - metabolism - metabolic | 56 | 16_glucose_glutamine_mitochondrial_metabolism |
| 17 | hcc - liver - sorafenib - hepatocellular - lenvatinib | 52 | 17_hcc_liver_sorafenib_hepatocellular |
| 18 | variants - ffpe - sequencing - variant - samples | 49 | 18_variants_ffpe_sequencing_variant |
| 19 | microbiome - microbial - bacterial - bacteria - microbiota | 49 | 19_microbiome_microbial_bacterial_bacteria |
| 20 | spatial - tissue - imaging - plex - image | 48 | 20_spatial_tissue_imaging_plex |
| 21 | sclc - elavl4 - hnf4a - lung - ne | 48 | 21_sclc_elavl4_hnf4a_lung |
| 22 | mice - human - humanized - mouse - engraftment | 46 | 22_mice_human_humanized_mouse |
| 23 | bone - os - metastasis - osteosarcoma - metastatic | 45 | 23_bone_os_metastasis_osteosarcoma |
| 24 | braf - tead - raf - melanoma - mek | 44 | 24_braf_tead_raf_melanoma |
| 25 | pca - prostate - psa - gleason - men | 44 | 25_pca_prostate_psa_gleason |
| 26 | cldn18 - cldn6 - claudin - cldn1 - cldn3 | 41 | 26_cldn18_cldn6_claudin_cldn1 |
| 27 | psma - 177lu - fap - uptake - 68ga | 41 | 27_psma_177lu_fap_uptake |
| 28 | mtap - prmt5 - mta - deleted - cooperative | 40 | 28_mtap_prmt5_mta_deleted |
| 29 | mm - myeloma - bone - cd38 - cst6 | 40 | 29_mm_myeloma_bone_cd38 |
| 30 | ecdna - somatic - skin - genome - mutational | 37 | 30_ecdna_somatic_skin_genome |
| 31 | incidence - risk - cancer - exposure - lifestyle | 37 | 31_incidence_risk_cancer_exposure |
| 32 | tce - cd3 - tces - gd - engagers | 36 | 32_tce_cd3_tces_gd |
| 33 | ctdna - mrd - recurrence - patients - months | 36 | 33_ctdna_mrd_recurrence_patients |
| 34 | ctcs - ctc - blood - biopsy - v7 | 35 | 34_ctcs_ctc_blood_biopsy |
| 35 | capsaicin - vialinin - apoptosis - apoptotic - compounds | 35 | 35_capsaicin_vialinin_apoptosis_apoptotic |
| 36 | crc - apc - wnt - colonic - intestinal | 34 | 36_crc_apc_wnt_colonic |
| 37 | ebv - npc - hpv - nasopharyngeal - hnscc | 34 | 37_ebv_npc_hpv_nasopharyngeal |
| 38 | pdac - pancreatic - nets - tme - immunosuppressive | 34 | 38_pdac_pancreatic_nets_tme |
| 39 | spatial - resolution - transcriptomics - tissue - xenium | 34 | 39_spatial_resolution_transcriptomics_tissue |
| 40 | cafs - caf - fibroblasts - axl - gc | 33 | 40_cafs_caf_fibroblasts_axl |
| 41 | test - abstract - text - you - your | 33 | 41_test_abstract_text_you |
| 42 | p53 - y220c - ddr - dna - repair | 33 | 42_p53_y220c_ddr_dna |
| 43 | data - ai - 500 - datasets - research | 33 | 43_data_ai_500_datasets |
| 44 | luad - lung - xage1 - znf687 - lusc | 32 | 44_luad_lung_xage1_znf687 |
| 45 | variants - brca1 - chek2 - bc - germline | 29 | 45_variants_brca1_chek2_bc |
| 46 | sting - agonist - cgas - interferon - activation | 29 | 46_sting_agonist_cgas_interferon |
| 47 | pdt - light - elp - nanoparticles - ph | 28 | 47_pdt_light_elp_nanoparticles |
| 48 | il - 12 - obp - 702 - tumor | 28 | 48_il_12_obp_702 |
| 49 | ccrcc - rcc - renal - vhl - carcinoma | 28 | 49_ccrcc_rcc_renal_vhl |
| 50 | vaccines - vaccine - neoantigen - mrna - peptides | 26 | 50_vaccines_vaccine_neoantigen_mrna |
| 51 | notch4 - dormancy - evs - e7011 - exosomes | 26 | 51_notch4_dormancy_evs_e7011 |
| 52 | slides - images - model - wsi - slide | 26 | 52_slides_images_model_wsi |
| 53 | smarca4 - smarca2 - smarca1 - 3236 - smd | 26 | 53_smarca4_smarca2_smarca1_3236 |
| 54 | cytof - spectral - cytometry - flow - xt | 24 | 54_cytof_spectral_cytometry_flow |
| 55 | mb - medulloblastoma - shh - nesc - tert | 23 | 55_mb_medulloblastoma_shh_nesc |
| 56 | cca - cholangiocarcinoma - bile - postn - duct | 22 | 56_cca_cholangiocarcinoma_bile_postn |
| 57 | ezh2 - ezh1 - fads2 - prc2 - h3k27me3 | 22 | 57_ezh2_ezh1_fads2_prc2 |
| 58 | wrn - msi - helicase - gsk4418959 - hro761 | 22 | 58_wrn_msi_helicase_gsk4418959 |
| 59 | ews - fli1 - ewing - ewsr1 - sarcoma | 22 | 59_ews_fli1_ewing_ewsr1 |
| 60 | cdk4 - 6i - resistant - resistance - er | 21 | 60_cdk4_6i_resistant_resistance |
| 61 | pdac - gemcitabine - pikfyve - pancreatic - metabolic | 21 | 61_pdac_gemcitabine_pikfyve_pancreatic |
| 62 | nb - mycn - neuroblastoma - 17q - gd2 | 21 | 62_nb_mycn_neuroblastoma_17q |
| 63 | macrophages - m1 - m2 - macrophage - tams | 20 | 63_macrophages_m1_m2_macrophage |
| 64 | egfr - bispecific - her3 - cmet - adc | 20 | 64_egfr_bispecific_her3_cmet |
| 65 | discovery - drug - library - covalent - hit | 20 | 65_discovery_drug_library_covalent |
| 66 | ferroptosis - gpx4 - peroxidation - ferroptotic - lipid | 20 | 66_ferroptosis_gpx4_peroxidation_ferroptotic |
| 67 | hnscc - hpv - fst - oscc - cyh33 | 19 | 67_hnscc_hpv_fst_oscc |
| 68 | drug - predictive - drugs - framework - enlight | 19 | 68_drug_predictive_drugs_framework |
| 69 | bcma - car - gprc5d - mm - cel | 19 | 69_bcma_car_gprc5d_mm |
| 70 | ackr1 - extravasation - metastatic - niche - endothelial | 18 | 70_ackr1_extravasation_metastatic_niche |
| 71 | gut - microbiome - microbiota - fmt - ici | 18 | 71_gut_microbiome_microbiota_fmt |
| 72 | cachexia - muscle - senescent - fisetin - gdf15 | 17 | 72_cachexia_muscle_senescent_fisetin |
| 73 | egfr - nsclc - egfrm - tki - mutations | 17 | 73_egfr_nsclc_egfrm_tki |
| 74 | icg - imaging - sln - nir - fluorescence | 17 | 74_icg_imaging_sln_nir |
| 75 | e3 - degradation - ligase - protacs - protac | 17 | 75_e3_degradation_ligase_protacs |
| 76 | pik3ca - pi3ka - alpelisib - pi3k - mutant | 17 | 76_pik3ca_pi3ka_alpelisib_pi3k |
| 77 | oncokb - variants - variant - oncotagger - somatic | 17 | 77_oncokb_variants_variant_oncotagger |
| 78 | ffpe - rna - samples - seq - fixed | 17 | 78_ffpe_rna_samples_seq |
| 79 | copd - risk - proteins - igfbp7 - mortality | 17 | 79_copd_risk_proteins_igfbp7 |
| 80 | hpv - opscc - pwh - infection - hiv | 16 | 80_hpv_opscc_pwh_infection |
| 81 | dietary - intake - food - risk - plant | 16 | 81_dietary_intake_food_risk |
| 82 | er - mcf - endocrine - estrogen - e2 | 16 | 82_er_mcf_endocrine_estrogen |
| 83 | pkmyt1 - wee1 - ccne1 - lunresertib - cdk1 | 16 | 83_pkmyt1_wee1_ccne1_lunresertib |
| 84 | lcs - screening - lung - sdm - risk | 15 | 84_lcs_screening_lung_sdm |
| 85 | cdh17 - cadherin - 054 - lbl - gi | 15 | 85_cdh17_cadherin_054_lbl |
| 86 | rms - fp - foxo1 - p3f - pax3 | 14 | 86_rms_fp_foxo1_p3f |
| 87 | myc - mycg4 - g4 - nucleolin - ddx5 | 14 | 87_myc_mycg4_g4_nucleolin |
| 88 | eac - ec - esophageal - pro - rkp | 14 | 88_eac_ec_esophageal_pro |
| 89 | hdac3 - hdac - gem144 - hdac8 - hdaci | 14 | 89_hdac3_hdac_gem144_hdac8 |
| 90 | culture - organoids - immune - co - tios | 14 | 90_culture_organoids_immune_co |
| 91 | ttfields - fields - dox - concomitant - electric | 13 | 91_ttfields_fields_dox_concomitant |
| 92 | cdk2 - ccne1 - cdk4 - cyclin - amplified | 13 | 92_cdk2_ccne1_cdk4_cyclin |
| 93 | cd73 - adenosine - a2ar - cd68 - immune | 13 | 93_cd73_adenosine_a2ar_cd68 |
| 94 | ilc - cdh1 - tfap2b - breast - lobular | 13 | 94_ilc_cdh1_tfap2b_breast |
| 95 | btk - nx - 5948 - lymphoma - c481s | 13 | 95_btk_nx_5948_lymphoma |
| 96 | hrd - hrr - biallelic - recombination - homologous | 13 | 96_hrd_hrr_biallelic_recombination |
| 97 | runx3 - paint - pkp3 - snord67 - 3q | 13 | 97_runx3_paint_pkp3_snord67 |
| 98 | kat6a - kat6 - er - kat6b - breast | 12 | 98_kat6a_kat6_er_kat6b |
| 99 | lncrnas - coding - ner - uterine - lncrna | 12 | 99_lncrnas_coding_ner_uterine |
| 100 | bca - numb - bladder - rock - muscle | 12 | 100_bca_numb_bladder_rock |
| 101 | vaccination - hpv - vaccine - hesitancy - covid | 12 | 101_vaccination_hpv_vaccine_hesitancy |
| 102 | blca - bladder - fgfr3 - mibc - nmibc | 12 | 102_blca_bladder_fgfr3_mibc |
| 103 | lnp - lnps - formulation - dsrna - lipid | 12 | 103_lnp_lnps_formulation_dsrna |
| 104 | germline - variants - pathogenic - ddx41 - read | 11 | 104_germline_variants_pathogenic_ddx41 |
| 105 | age - aged - aging - young - mice | 11 | 105_age_aged_aging_young |
| 106 | pdx - hci - models - hbcu - drug | 11 | 106_pdx_hci_models_hbcu |
| 107 | obesity - butyrate - diet - fto - obese | 11 | 107_obesity_butyrate_diet_fto |
| 108 | hpk1 - hdm2006 - 306 - s109 - ubx | 10 | 108_hpk1_hdm2006_306_s109 |
| 109 | ldrt - metabolic - cd8 - lactylation - tcredcd39koher2 | 10 | 109_ldrt_metabolic_cd8_lactylation |
| 110 | hypoxia - hypoxic - hif1a - mhc1pp - ifn | 10 | 110_hypoxia_hypoxic_hif1a_mhc1pp |
| 111 | cd47 - sirpa - smagp - avfc - imc | 10 | 111_cd47_sirpa_smagp_avfc |
| 112 | nepc - prostate - pik3r1 - ceacam5 - ar | 10 | 112_nepc_prostate_pik3r1_ceacam5 |
| 113 | eif4e - translation - cap - eif4f - ovarian | 9 | 113_eif4e_translation_cap_eif4f |
| 114 | ev - evs - mgm - plasma - biomarkers | 9 | 114_ev_evs_mgm_plasma |
| 115 | ipro - prediction - performance - rpslearner - ct | 9 | 115_ipro_prediction_performance_rpslearner |
| 116 | tf - xb371 - adce - uparap - coagulation | 9 | 116_tf_xb371_adce_uparap |
| 117 | icis - ali - cish - anti - lag | 9 | 117_icis_ali_cish_anti |
| 118 | nicotine - cigarette - memantine - bw813u - smoking | 9 | 118_nicotine_cigarette_memantine_bw813u |
| 119 | nnmt - dnmt1 - stm9005 - mettl1 - rrm1 | 9 | 119_nnmt_dnmt1_stm9005_mettl1 |
| 120 | eps - states - state - single - sub | 9 | 120_eps_states_state_single |
| 121 | gastric - gc - tsrna - eo - cops5 | 9 | 121_gastric_gc_tsrna_eo |
| 122 | risk - women - bbd - breast - missing | 9 | 122_risk_women_bbd_breast |
| 123 | h7 - bispecific - b7 - npx372 - tim | 9 | 123_h7_bispecific_b7_npx372 |
| 124 | nat - rectal - course - neoadjuvant - ild | 9 | 124_nat_rectal_course_neoadjuvant |
| 125 | xpo1 - hsp90 - xpr1 - selinexor - slc34a2 | 9 | 125_xpo1_hsp90_xpr1_selinexor |
| 126 | p2x4 - pca - sqle - crisp3 - cxcr7 | 9 | 126_p2x4_pca_sqle_crisp3 |
| 127 | ripk1 - lig1 - ctps2 - cisplatin - lig1het | 8 | 127_ripk1_lig1_ctps2_cisplatin |
| 128 | age - dnam - risk - cpg - mage | 8 | 128_age_dnam_risk_cpg |
| 129 | women - breast - lrig1 - duffy - bpe | 8 | 129_women_breast_lrig1_duffy |
| 130 | nectin - ev - uc - glr1059 - iph4502 | 8 | 130_nectin_ev_uc_glr1059 |
| 131 | ros1 - egfr - tkd - nsclc - zongertinib | 8 | 131_ros1_egfr_tkd_nsclc |
| 132 | abd147 - clickable - binder - 225ac - capac | 8 | 132_abd147_clickable_binder_225ac |
| 133 | 34a - mir - endosomal - fm - nigericin | 8 | 133_34a_mir_endosomal_fm |
| 134 | tcr - tcrs - prame - hla - supercharged | 8 | 134_tcr_tcrs_prame_hla |
| 135 | spatial - l2 - immune - geomx - microenvironment | 8 | 135_spatial_l2_immune_geomx |
| 136 | sedentary - physical - 93 - able - spent | 8 | 136_sedentary_physical_93_able |
| 137 | fulvestrant - pts - bireociclib - endocrine - cdk4 | 8 | 137_fulvestrant_pts_bireociclib_endocrine |
| 138 | mal - trials - dose - oncology - cost | 8 | 138_mal_trials_dose_oncology |
| 139 | adar1 - editing - p150 - rna - ribi | 8 | 139_adar1_editing_p150_rna |
| 140 | adulthood - bmi - bri - alcohol - selenium | 8 | 140_adulthood_bmi_bri_alcohol |
| 141 | ctdna - ddpcr - mutations - plasma - monitoring | 8 | 141_ctdna_ddpcr_mutations_plasma |
| 142 | cadonilimab - bnt116 - safety - resectable - penpulimab | 7 | 142_cadonilimab_bnt116_safety_resectable |
| 143 | irf4 - tbxt - persistence - resistant - drug | 7 | 143_irf4_tbxt_persistence_resistant |
| 144 | btz - pi - proteasome - mm - ceritinib | 7 | 144_btz_pi_proteasome_mm |
| 145 | nps - til - tgfb - brg399 - helios | 7 | 145_nps_til_tgfb_brg399 |
| 146 | lymphotoxin - hnscc - cd24 - il - ctla2a | 7 | 146_lymphotoxin_hnscc_cd24_il |
| 147 | kif18a - cin - mitotic - yf550 - hw221043 | 7 | 147_kif18a_cin_mitotic_yf550 |
| 148 | nad - nampt - nmn - ot - 82 | 7 | 148_nad_nampt_nmn_ot |
| 149 | arid1a - arid1b - swi - snf - eo3001 | 7 | 149_arid1a_arid1b_swi_snf |
| 150 | ptpn1 - all - bcp - nhd13 - splicing | 7 | 150_ptpn1_all_bcp_nhd13 |
| 151 | allo - asct - mm - hct - pem | 7 | 151_allo_asct_mm_hct |
| 152 | flc - dnaj - pkac - fibrolamellar - surgery | 7 | 152_flc_dnaj_pkac_fibrolamellar |
| 153 | telomerase - clpxp - telomere - g4 - clpx | 7 | 153_telomerase_clpxp_telomere_g4 |
| 154 | pc53k - tie2 - ku - yb - ovarian | 6 | 154_pc53k_tie2_ku_yb |
| 155 | hydrogel - ecm - decm - matrix - kyse30 | 6 | 155_hydrogel_ecm_decm_matrix |
| 156 | msln - rc88 - zw171 - binding - 08052666 | 6 | 156_msln_rc88_zw171_binding |
| 157 | cachexia - muscle - edema - sma - adiposity | 6 | 157_cachexia_muscle_edema_sma |
| 158 | emb - wx390 - dcr - mcrc - orr | 6 | 158_emb_wx390_dcr_mcrc |
| 159 | neoantigens - frameshift - antigens - hla - as10 | 6 | 159_neoantigens_frameshift_antigens_hla |
| 160 | smip34 - rlip - atovaquone - eoc - cddp | 6 | 160_smip34_rlip_atovaquone_eoc |
| 161 | rd3 - sided - colorectal - polyps - left | 5 | 161_rd3_sided_colorectal_polyps |
| 162 | onc212 - onc206 - onc201 - atg101 - imipridones | 5 | 162_onc212_onc206_onc201_atg101 |
| 163 | hcc - gzmk - 37 - ph102 - foxp3high | 5 | 163_hcc_gzmk_37_ph102 |
| 164 | vitae - nec - nunc - id - sed | 5 | 164_vitae_nec_nunc_id |
| 165 | emphysematous - ct - group - ca - recurrence | 5 | 165_emphysematous_ct_group_ca |
| 166 | til - stim - reactive - feeder - obx | 5 | 166_til_stim_reactive_feeder |
| 167 | fao - atp - kn510713 - cac - acaa1 | 5 | 167_fao_atp_kn510713_cac |
| 168 | 3d - 2d - pathology - specimen - sections | 5 | 168_3d_2d_pathology_specimen |
| 169 | radiation - flash - ray - fr - kvp | 5 | 169_radiation_flash_ray_fr |

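
The table lists five keywords per topic, and each label appears to be the topic ID joined to the first four keywords with underscores. The following is a minimal sketch for working with those labels and for pulling keywords from a loaded model. It assumes labels follow that pattern and that keywords contain no underscores. `parse_label` and `top_keywords` are illustrative helper names, not part of BERTopic; the only library call used is `get_topic`, which returns `(word, score)` pairs for a known topic ID and `False` otherwise.

```python
from typing import List, Tuple


def parse_label(label: str) -> Tuple[int, List[str]]:
    """Split a label such as '1_kras_ras_g12c_mutant' into its ID and keywords."""
    # Everything before the first underscore is the topic ID.
    # int() also handles "-1", the ID BERTopic uses for outliers.
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")


def top_keywords(topic_model, topic_id: int, n: int = 5) -> List[str]:
    """Return up to n keywords for a topic, or [] if the topic ID is unknown."""
    # get_topic yields a list of (word, score) pairs, or False for a missing ID.
    pairs = topic_model.get_topic(topic_id)
    return [word for word, _score in pairs[:n]] if pairs else []
```

With the model loaded as shown under Usage, `top_keywords(topic_model, *parse_label("1_kras_ras_g12c_mutant")[:1])` would look up topic 1; raising `n` up to the model's `top_n_words` setting returns the longer keyword list.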
## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.40
* UMAP: 0.5.7
* Pandas: 2.2.2
* Scikit-Learn: 1.6.1
* Sentence-transformers: 3.4.1
* Transformers: 4.48.2
* Numba: 0.61.0
* Plotly: 5.24.1
* Python: 3.11.11
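
The names listed under Training hyperparameters match keyword arguments of the `BERTopic` constructor, so a comparable (unfitted) model could be configured roughly as sketched below. This is not the author's training script: the embedding model and the UMAP/HDBSCAN settings are not stated in this card, so `embedding_model` is a placeholder the caller must supply, and `build_topic_model` is a hypothetical helper name.

```python
# Values copied from the "Training hyperparameters" list in this card.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(embedding_model):
    """Create an unfitted BERTopic instance using the listed settings.

    `embedding_model` is an assumption: the card does not name the model
    that was used, so the caller has to choose one.
    """
    # Imported lazily so the settings dict can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

Fitting such a model on a different corpus, or with different library versions than those listed above, will generally produce different topics from the ones in the table.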