transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("Ruslan10/transformers_issues_topics")

topic_model.get_topic_info()
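
Once the model is loaded, individual topics can be inspected as well. The sketch below is a minimal illustration, not part of the original card: `top_keywords` and `demo` are hypothetical helper names, and the code assumes that `get_topic` returns a list of `(word, score)` pairs, or a falsy value when the topic ID is unknown, as it does in recent BERTopic releases.

```python
def top_keywords(topic_model, topic_id, n=5):
    """Return up to n highest-ranked keywords for one topic (hypothetical helper)."""
    # get_topic is assumed to return [(word, score), ...] or a falsy value
    # when the topic ID does not exist in the model.
    words = topic_model.get_topic(topic_id)
    if not words:
        return []
    return [word for word, _score in words[:n]]


def demo():
    """Load the published model and print the keywords of topic 4.

    Not called automatically: it needs bertopic installed and network access.
    """
    from bertopic import BERTopic

    topic_model = BERTopic.load("Ruslan10/transformers_issues_topics")
    print(top_keywords(topic_model, 4))
```

Calling `demo()` prints the leading keywords of topic 4; the exact words depend on the downloaded model.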

Topic overview

  • Number of topics: 30
  • Number of training documents: 9000
| Topic ID | Topic Keywords | Topic Frequency | Label |
| --- | --- | --- | --- |
| -1 | bert - tensorflow - pytorch - pretrained - models | 11 | -1_bert_tensorflow_pytorch_pretrained |
| 0 | bertforsequenceclassification - encoderdecoder - berttokenizer - tokenizer - bert | 2042 | 0_bertforsequenceclassification_encoderdecoder_berttokenizer_tokenizer |
| 1 | pytorch - tensorflow - modelingutilspy - tensors - tensor | 1903 | 1_pytorch_tensorflow_modelingutilspy_tensors |
| 2 | seq2seqtrainer - seq2seq - testing - tests - prepareseq2seqbatch | 735 | 2_seq2seqtrainer_seq2seq_testing_tests |
| 3 | docstring - docstrings - readmetxt - doc - readmemd | 603 | 3_docstring_docstrings_readmetxt_doc |
| 4 | gpt2 - gpt2tokenizer - gpt2tokenizerfast - gpt - gpt2model | 513 | 4_gpt2_gpt2tokenizer_gpt2tokenizerfast_gpt |
| 5 | trainertrain - trainer - trainers - training - evaluateduringtraining | 442 | 5_trainertrain_trainer_trainers_training |
| 6 | modelcard - modelcards - card - model - models | 440 | 6_modelcard_modelcards_card_model |
| 7 | albertforpretraining - xlnet - xlnetlmheadmodel - albertbasev2 - albertformaskedlm | 436 | 7_albertforpretraining_xlnet_xlnetlmheadmodel_albertbasev2 |
| 8 | t5 - t5model - tf - t5base - tf2 | 359 | 8_t5_t5model_tf_t5base |
| 9 | transformerscli - transformers - transformer - transformerxl - importerror | 259 | 9_transformerscli_transformers_transformer_transformerxl |
| 10 | ner - pipeline - pipelines - nerpipeline - fillmaskpipeline | 197 | 10_ner_pipeline_pipelines_nerpipeline |
| 11 | questionansweringpipeline - questionanswering - answering - tfalbertforquestionanswering - questionasnwering | 159 | 11_questionansweringpipeline_questionanswering_answering_tfalbertforquestionanswering |
| 12 | longformer - longform - longformers - longformerlayer - longformerformultiplechoice | 135 | 12_longformer_longform_longformers_longformerlayer |
| 13 | onnx - onnxonnxruntime - onnxexport - 04onnxexport - 04onnxexportipynb | 117 | 13_onnx_onnxonnxruntime_onnxexport_04onnxexport |
| 14 | generationbeamsearchpy - generatebeamsearch - generatebeamsearchoutputs - beamsearch - nonbeamsearch | 95 | 14_generationbeamsearchpy_generatebeamsearch_generatebeamsearchoutputs_beamsearch |
| 15 | benchmark - benchmarks - accuracy - evaluation - metrics | 90 | 15_benchmark_benchmarks_accuracy_evaluation |
| 16 | huggingfacemaster - huggingfacetokenizers297 - huggingface - huggingfacetransformers - huggingfacetransformer | 83 | 16_huggingfacemaster_huggingfacetokenizers297_huggingface_huggingfacetransformers |
| 17 | datacollatorforlanguagemodelingfile - datacollatorforlanguagemodeling - datacollatorforlanguagemodelling - datacollatorforpermutationlanguagemodeling - datacollatorfornextsentenceprediction | 77 | 17_datacollatorforlanguagemodelingfile_datacollatorforlanguagemodeling_datacollatorforlanguagemodelling_datacollatorforpermutationlanguagemodeling |
| 18 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 52 | 18_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
| 19 | notebook - notebooks - community - colab - t5 | 48 | 19_notebook_notebooks_community_colab |
| 20 | wandbproject - wandb - sagemaker - sagemakertrainer - wandbcallback | 39 | 20_wandbproject_wandb_sagemaker_sagemakertrainer |
| 21 | cachedir - cache - cachedpath - caching - cached | 34 | 21_cachedir_cache_cachedpath_caching |
| 22 | electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification | 32 | 22_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice |
| 23 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 23 | 23_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 24 | dict - dictstr - returndict - parse - arguments | 18 | 24_dict_dictstr_returndict_parse |
| 25 | pplm - pr - deprecated - variable - ppl | 18 | 25_pplm_pr_deprecated_variable |
| 26 | isort - blackisortflake8 - github - repo - version | 15 | 26_isort_blackisortflake8_github_repo |
| 27 | blenderbot - blenderbot3b - blenderbotforcausallm - bot - boto3 | 14 | 27_blenderbot_blenderbot3b_blenderbotforcausallm_bot |
| 28 | indexerror - index - missingindex - indices - runtimeerror | 11 | 28_indexerror_index_missingindex_indices |
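
The frequencies above can be cross-checked against the stated number of training documents. The following sketch copies the counts from the table; `TOPIC_COUNTS` and `share` are illustrative names, not part of BERTopic. It shows that the 30 counts (29 topics plus the outlier topic -1) add up to the 9000 training documents, and that topics 0 and 1 together account for roughly 44% of them.

```python
# Document counts per topic ID, copied from the topic overview table.
TOPIC_COUNTS = {
    -1: 11, 0: 2042, 1: 1903, 2: 735, 3: 603, 4: 513, 5: 442, 6: 440,
    7: 436, 8: 359, 9: 259, 10: 197, 11: 159, 12: 135, 13: 117, 14: 95,
    15: 90, 16: 83, 17: 77, 18: 52, 19: 48, 20: 39, 21: 34, 22: 32,
    23: 23, 24: 18, 25: 18, 26: 15, 27: 14, 28: 11,
}

TOTAL_DOCUMENTS = sum(TOPIC_COUNTS.values())


def share(topic_ids):
    """Fraction of all training documents assigned to the given topics."""
    return sum(TOPIC_COUNTS[t] for t in topic_ids) / TOTAL_DOCUMENTS


if __name__ == "__main__":
    print(f"total documents: {TOTAL_DOCUMENTS}")
    print(f"share of topics 0 and 1: {share([0, 1]):.1%}")
    print(f"share of outlier topic -1: {share([-1]):.2%}")
```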

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
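
For reference, the settings listed above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how a model with the same configuration might be instantiated; it is an assumption that no custom embedding, UMAP, HDBSCAN, or vectorizer sub-models were used, since the card does not list any. `HYPERPARAMETERS` and `build_model` are illustrative names.

```python
# Hyperparameters as listed on this card, keyed by BERTopic constructor argument.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model():
    """Create an untrained BERTopic model with the settings above.

    Imports bertopic lazily so this module can be read without it installed.
    Sub-models (embedding, UMAP, HDBSCAN, vectorizer) are left at their
    defaults, which is an assumption rather than something the card states.
    """
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)
```

Calling `build_model().fit_transform(docs)` on a list of strings would then train a comparable model; results will differ from the published one unless the same data and random state are used.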

Framework versions

  • Numpy: 2.0.2
  • HDBSCAN: 0.8.41
  • UMAP: 0.5.11
  • Pandas: 2.2.2
  • Scikit-Learn: 1.6.1
  • Sentence-transformers: 5.2.0
  • Transformers: 4.57.6
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.12.12
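
Loading a serialized model is generally most reliable when the local environment is close to the one listed above. The following sketch compares installed package versions with those values using only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are filled in here and are not given on the card, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by PyPI distribution name (assumed names).
EXPECTED_VERSIONS = {
    "numpy": "2.0.2",
    "hdbscan": "0.8.41",
    "umap-learn": "0.5.11",
    "pandas": "2.2.2",
    "scikit-learn": "1.6.1",
    "sentence-transformers": "5.2.0",
    "transformers": "4.57.6",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (expected, installed)} for every differing package.

    The installed value is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches().items():
        print(f"{name}: card lists {wanted}, installed {installed}")
```

A mismatch does not necessarily prevent loading; it only flags where the environment differs from the one used for training.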