Thang203 committed on
Commit cfdc927 · verified · 1 Parent(s): bcc949c

Add BERTopic model

Files changed (6)
  1. README.md +81 -0
  2. config.json +17 -0
  3. ctfidf.bin +3 -0
  4. ctfidf_config.json +0 -0
  5. topic_embeddings.bin +3 -0
  6. topics.json +2626 -0
README.md ADDED
@@ -0,0 +1,81 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ pipeline_tag: text-classification
+ ---
+
+ # industry-mar11
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("Thang203/industry-mar11")
+
+ topic_model.get_topic_info()
+ ```
+
+ ## Topic overview
+
+ * Number of topics: 12
+ * Number of training documents: 516
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | -1 | models - language - data - large - language models | 51 | -1_models_language_data_large |
+ | 0 | multimodal - visual - image - models - generation | 169 | 0_multimodal_visual_image_models |
+ | 1 | speech - asr - text - speaker - recognition | 24 | 1_speech_asr_text_speaker |
+ | 2 | detection - models - text - language - model | 21 | 2_detection_models_text_language |
+ | 3 | code - language - llms - models - programming | 32 | 3_code_language_llms_models |
+ | 4 | agents - policy - language - learning - tasks | 49 | 4_agents_policy_language_learning |
+ | 5 | reasoning - cot - problems - models - commonsense | 22 | 5_reasoning_cot_problems_models |
+ | 6 | retrieval - information - query - llms - queries | 19 | 6_retrieval_information_query_llms |
+ | 7 | ai - models - language - dialogue - human | 15 | 7_ai_models_language_dialogue |
+ | 8 | language - models - translation - model - language models | 47 | 8_language_models_translation_model |
+ | 9 | distillation - model - knowledge - pretrained - student | 51 | 9_distillation_model_knowledge_pretrained |
+ | 10 | training - model - models - transformer - language | 16 | 10_training_model_models_transformer |
+
+ </details>
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: False
+ * language: english
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: 20
+ * seed_topic_list: None
+ * top_n_words: 10
+ * verbose: True
+ * zeroshot_min_similarity: 0.7
+ * zeroshot_topic_list: None
+
+ ## Framework versions
+
+ * Numpy: 1.25.2
+ * HDBSCAN: 0.8.33
+ * UMAP: 0.5.5
+ * Pandas: 1.5.3
+ * Scikit-Learn: 1.2.2
+ * Sentence-transformers: 2.6.1
+ * Transformers: 4.38.2
+ * Numba: 0.58.1
+ * Plotly: 5.15.0
+ * Python: 3.10.12
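The topic overview in the README can be cross-checked against its stated number of training documents. The sketch below copies the Topic ID and Topic Frequency columns from the table into a mapping and sums them. The names `TOPIC_FREQUENCIES` and `summarize` are hypothetical and used only for illustration; they are not part of BERTopic.

```python
# Topic ID -> Topic Frequency, copied from the README table in this commit.
# TOPIC_FREQUENCIES is a hypothetical name used only for this illustration.
TOPIC_FREQUENCIES = {
    -1: 51, 0: 169, 1: 24, 2: 21, 3: 32, 4: 49,
    5: 22, 6: 19, 7: 15, 8: 47, 9: 51, 10: 16,
}


def summarize(frequencies):
    """Return (number of topics, total documents, share assigned to topic -1)."""
    total = sum(frequencies.values())
    outlier_share = frequencies.get(-1, 0) / total
    return len(frequencies), total, outlier_share


n_topics, n_docs, outlier_share = summarize(TOPIC_FREQUENCIES)
print(f"{n_topics} topics, {n_docs} documents, {outlier_share:.1%} in topic -1")
```

The frequencies add up to the 516 training documents listed in the README, and the count of 12 topics includes topic -1, which BERTopic uses for outliers.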
config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "calculate_probabilities": false,
+   "language": "english",
+   "low_memory": false,
+   "min_topic_size": 10,
+   "n_gram_range": [
+     1,
+     1
+   ],
+   "nr_topics": 20,
+   "seed_topic_list": null,
+   "top_n_words": 10,
+   "verbose": true,
+   "zeroshot_min_similarity": 0.7,
+   "zeroshot_topic_list": null,
+   "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
+ }
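The saved configuration repeats the hyperparameters listed in the README and adds the name of the embedding model. As a minimal sketch, the snippet below reads the JSON with the standard library and separates the two. `split_config` is a hypothetical helper written for this illustration, not BERTopic's own loading code, and no BERTopic call is made.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_JSON = """
{
  "calculate_probabilities": false,
  "language": "english",
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": 20,
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
"""


def split_config(raw):
    """Split the saved config into hyperparameters and the embedding model name."""
    config = json.loads(raw)
    embedding_model = config.pop("embedding_model", None)
    # JSON has no tuple type, so the n-gram range is stored as a list.
    config["n_gram_range"] = tuple(config["n_gram_range"])
    return config, embedding_model


hyperparameters, embedding_model = split_config(CONFIG_JSON)
print(hyperparameters["n_gram_range"], embedding_model)
```

Converting `n_gram_range` back to a tuple matches the `(1, 1)` form shown in the README's hyperparameter list.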
ctfidf.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:22ea0d7033415f0d68f6e5b3563bed34d3c00987d2b635e6211ee2d6cd012bc3
+ size 343171
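The three lines added for `ctfidf.bin` are a Git LFS pointer, not the binary itself: each line is a key followed by a space and a value. A small sketch of reading such a pointer is shown below. `parse_lfs_pointer` is a hypothetical helper for illustration, not part of Git LFS or of the Hugging Face tooling.

```python
# Pointer text copied from the ctfidf.bin diff in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:22ea0d7033415f0d68f6e5b3563bed34d3c00987d2b635e6211ee2d6cd012bc3
size 343171
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

The `size` field gives the size of the real file in bytes, so the c-TF-IDF matrix stored here is roughly 343 kB.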
ctfidf_config.json ADDED
The diff for this file is too large to render. See raw diff
 
topic_embeddings.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29deb3b6f42441c41506dbeced18ce744b1a821a350aadaa3e1af29304b1edc6
+ size 19721
topics.json ADDED
@@ -0,0 +1,2626 @@
1
+ {
2
+ "topic_representations": {
3
+ "-1": [
4
+ [
5
+ "models",
6
+ 0.03389659414835857
7
+ ],
8
+ [
9
+ "language",
10
+ 0.02862851018591238
11
+ ],
12
+ [
13
+ "data",
14
+ 0.025761470494060225
15
+ ],
16
+ [
17
+ "large",
18
+ 0.022745338330438167
19
+ ],
20
+ [
21
+ "language models",
22
+ 0.02139331315319328
23
+ ],
24
+ [
25
+ "tasks",
26
+ 0.019900039131413454
27
+ ],
28
+ [
29
+ "model",
30
+ 0.019776504965187965
31
+ ],
32
+ [
33
+ "llms",
34
+ 0.01898021896221333
35
+ ],
36
+ [
37
+ "large language",
38
+ 0.018796014150363968
39
+ ],
40
+ [
41
+ "large language models",
42
+ 0.01803230842296634
43
+ ]
44
+ ],
45
+ "0": [
46
+ [
47
+ "multimodal",
48
+ 0.06136737485351937
49
+ ],
50
+ [
51
+ "visual",
52
+ 0.05853719763492334
53
+ ],
54
+ [
55
+ "image",
56
+ 0.0485116842448941
57
+ ],
58
+ [
59
+ "models",
60
+ 0.04071455273538424
61
+ ],
62
+ [
63
+ "generation",
64
+ 0.03666106634697344
65
+ ],
66
+ [
67
+ "video",
68
+ 0.03347983737726034
69
+ ],
70
+ [
71
+ "understanding",
72
+ 0.030394954956701734
73
+ ],
74
+ [
75
+ "large",
76
+ 0.02799241301244745
77
+ ],
78
+ [
79
+ "instruction",
80
+ 0.026460918679033146
81
+ ],
82
+ [
83
+ "model",
84
+ 0.025928671364185387
85
+ ]
86
+ ],
87
+ "1": [
88
+ [
89
+ "speech",
90
+ 0.11570004480898892
91
+ ],
92
+ [
93
+ "asr",
94
+ 0.07577068396313229
95
+ ],
96
+ [
97
+ "text",
98
+ 0.045675035457062994
99
+ ],
100
+ [
101
+ "speaker",
102
+ 0.04413188327842426
103
+ ],
104
+ [
105
+ "recognition",
106
+ 0.0425835487386093
107
+ ],
108
+ [
109
+ "speech recognition",
110
+ 0.03380651953632135
111
+ ],
112
+ [
113
+ "model",
114
+ 0.030672757538167814
115
+ ],
116
+ [
117
+ "voice",
118
+ 0.030005193176805063
119
+ ],
120
+ [
121
+ "language",
122
+ 0.028788932413434717
123
+ ],
124
+ [
125
+ "proposed",
126
+ 0.028393525385724573
127
+ ]
128
+ ],
129
+ "2": [
130
+ [
131
+ "detection",
132
+ 0.04428034234306033
133
+ ],
134
+ [
135
+ "models",
136
+ 0.034572645259843875
137
+ ],
138
+ [
139
+ "text",
140
+ 0.034327288774788946
141
+ ],
142
+ [
143
+ "language",
144
+ 0.032811332249369544
145
+ ],
146
+ [
147
+ "model",
148
+ 0.027098162035851945
149
+ ],
150
+ [
151
+ "large",
152
+ 0.02498863472584851
153
+ ],
154
+ [
155
+ "language models",
156
+ 0.02455132759078857
157
+ ],
158
+ [
159
+ "misinformation",
160
+ 0.021777813891068417
161
+ ],
162
+ [
163
+ "dataset",
164
+ 0.020171781105023295
165
+ ],
166
+ [
167
+ "large language",
168
+ 0.019042601273926114
169
+ ]
170
+ ],
171
+ "3": [
172
+ [
173
+ "code",
174
+ 0.0770374212782715
175
+ ],
176
+ [
177
+ "language",
178
+ 0.03246858995960753
179
+ ],
180
+ [
181
+ "llms",
182
+ 0.0314836868548782
183
+ ],
184
+ [
185
+ "models",
186
+ 0.03135048850848837
187
+ ],
188
+ [
189
+ "programming",
190
+ 0.031032804646709017
191
+ ],
192
+ [
193
+ "software",
194
+ 0.02338912467302265
195
+ ],
196
+ [
197
+ "language models",
198
+ 0.021984864229269385
199
+ ],
200
+ [
201
+ "tasks",
202
+ 0.020079101943328282
203
+ ],
204
+ [
205
+ "model",
206
+ 0.019589308762890383
207
+ ],
208
+ [
209
+ "large language",
210
+ 0.01903123933999179
211
+ ]
212
+ ],
213
+ "4": [
214
+ [
215
+ "agents",
216
+ 0.03144148115888399
217
+ ],
218
+ [
219
+ "policy",
220
+ 0.030578201460243676
221
+ ],
222
+ [
223
+ "language",
224
+ 0.02958531504638217
225
+ ],
226
+ [
227
+ "learning",
228
+ 0.028956274103493298
229
+ ],
230
+ [
231
+ "tasks",
232
+ 0.02750394011875191
233
+ ],
234
+ [
235
+ "llms",
236
+ 0.02635994576153496
237
+ ],
238
+ [
239
+ "agent",
240
+ 0.025182048261377144
241
+ ],
242
+ [
243
+ "games",
244
+ 0.024542318750706546
245
+ ],
246
+ [
247
+ "knowledge",
248
+ 0.02370911390325935
249
+ ],
250
+ [
251
+ "model",
252
+ 0.022937914034634292
253
+ ]
254
+ ],
255
+ "5": [
256
+ [
257
+ "reasoning",
258
+ 0.0929222616473664
259
+ ],
260
+ [
261
+ "cot",
262
+ 0.040278262754808086
263
+ ],
264
+ [
265
+ "problems",
266
+ 0.03708374749407663
267
+ ],
268
+ [
269
+ "models",
270
+ 0.037066520645951874
271
+ ],
272
+ [
273
+ "commonsense",
274
+ 0.03179862849063796
275
+ ],
276
+ [
277
+ "prompting",
278
+ 0.029329865166081277
279
+ ],
280
+ [
281
+ "language",
282
+ 0.028264168425802164
283
+ ],
284
+ [
285
+ "language models",
286
+ 0.0264493153712725
287
+ ],
288
+ [
289
+ "math",
290
+ 0.025706499789005296
291
+ ],
292
+ [
293
+ "chainofthought",
294
+ 0.025706499789005296
295
+ ]
296
+ ],
297
+ "6": [
298
+ [
299
+ "retrieval",
300
+ 0.05202021714558848
301
+ ],
302
+ [
303
+ "information",
304
+ 0.03944643968452574
305
+ ],
306
+ [
307
+ "query",
308
+ 0.03862936737060072
309
+ ],
310
+ [
311
+ "llms",
312
+ 0.03381477714650923
313
+ ],
314
+ [
315
+ "queries",
316
+ 0.030743284058903933
317
+ ],
318
+ [
319
+ "models",
320
+ 0.029746047746957945
321
+ ],
322
+ [
323
+ "language",
324
+ 0.029551563260931873
325
+ ],
326
+ [
327
+ "language models",
328
+ 0.026223664397697355
329
+ ],
330
+ [
331
+ "large",
332
+ 0.02485613931372842
333
+ ],
334
+ [
335
+ "information retrieval",
336
+ 0.023226595334291323
337
+ ]
338
+ ],
339
+ "7": [
340
+ [
341
+ "ai",
342
+ 0.035714494707298816
343
+ ],
344
+ [
345
+ "models",
346
+ 0.02952586052201538
347
+ ],
348
+ [
349
+ "language",
350
+ 0.02834496901204038
351
+ ],
352
+ [
353
+ "dialogue",
354
+ 0.027651780633986222
355
+ ],
356
+ [
357
+ "human",
358
+ 0.026401782297458473
359
+ ],
360
+ [
361
+ "llms",
362
+ 0.025442822490930102
363
+ ],
364
+ [
365
+ "chatgpt",
366
+ 0.02318990727379372
367
+ ],
368
+ [
369
+ "large language",
370
+ 0.02271947226012763
371
+ ],
372
+ [
373
+ "large",
374
+ 0.021957413569739143
375
+ ],
376
+ [
377
+ "model",
378
+ 0.02070839524294738
379
+ ]
380
+ ],
381
+ "8": [
382
+ [
383
+ "language",
384
+ 0.04093864302301298
385
+ ],
386
+ [
387
+ "models",
388
+ 0.03595221175799092
389
+ ],
390
+ [
391
+ "translation",
392
+ 0.031712613088874894
393
+ ],
394
+ [
395
+ "model",
396
+ 0.030177929651754233
397
+ ],
398
+ [
399
+ "language models",
400
+ 0.026247976024177194
401
+ ],
402
+ [
403
+ "text",
404
+ 0.024834259305576166
405
+ ],
406
+ [
407
+ "data",
408
+ 0.02462670002569503
409
+ ],
410
+ [
411
+ "generation",
412
+ 0.020743602919543393
413
+ ],
414
+ [
415
+ "tasks",
416
+ 0.020568403779006268
417
+ ],
418
+ [
419
+ "machine translation",
420
+ 0.019130933056569405
421
+ ]
422
+ ],
423
+ "9": [
424
+ [
425
+ "distillation",
426
+ 0.04337789490301995
427
+ ],
428
+ [
429
+ "model",
430
+ 0.040261980975691315
431
+ ],
432
+ [
433
+ "knowledge",
434
+ 0.03986242788324582
435
+ ],
436
+ [
437
+ "pretrained",
438
+ 0.039810767531247584
439
+ ],
440
+ [
441
+ "student",
442
+ 0.03578735650250997
443
+ ],
444
+ [
445
+ "models",
446
+ 0.03577800012735637
447
+ ],
448
+ [
449
+ "teacher",
450
+ 0.034995506692116485
451
+ ],
452
+ [
453
+ "30",
454
+ 0.03383519763051433
455
+ ],
456
+ [
457
+ "pretraining",
458
+ 0.030341356455441396
459
+ ],
460
+ [
461
+ "language",
462
+ 0.029777618334555132
463
+ ]
464
+ ],
465
+ "10": [
466
+ [
467
+ "training",
468
+ 0.039846773292934345
469
+ ],
470
+ [
471
+ "model",
472
+ 0.03354112714562384
473
+ ],
474
+ [
475
+ "models",
476
+ 0.03309176444172136
477
+ ],
478
+ [
479
+ "transformer",
480
+ 0.02791942196230748
481
+ ],
482
+ [
483
+ "language",
484
+ 0.024257026345120718
485
+ ],
486
+ [
487
+ "finetuning",
488
+ 0.022555042408780118
489
+ ],
490
+ [
491
+ "large",
492
+ 0.02231168487342353
493
+ ],
494
+ [
495
+ "quantization",
496
+ 0.021953720153927197
497
+ ],
498
+ [
499
+ "transformers",
500
+ 0.02143379388468265
501
+ ],
502
+ [
503
+ "tasks",
504
+ 0.020718276629461102
505
+ ]
506
+ ]
507
+ },
508
+ "topics": [
509
+ 8,
510
+ 2,
511
+ 1,
512
+ -1,
513
+ 10,
514
+ -1,
515
+ 8,
516
+ 10,
517
+ -1,
518
+ 10,
519
+ 9,
520
+ -1,
521
+ 8,
522
+ 8,
523
+ 3,
524
+ -1,
525
+ 8,
526
+ -1,
527
+ -1,
528
+ 5,
529
+ -1,
530
+ 8,
531
+ 10,
532
+ -1,
533
+ 9,
534
+ 6,
535
+ 9,
536
+ -1,
537
+ -1,
538
+ 7,
539
+ 7,
540
+ 6,
541
+ 8,
542
+ 7,
543
+ 10,
544
+ 10,
545
+ 4,
546
+ 6,
547
+ 8,
548
+ 8,
549
+ 10,
550
+ 8,
551
+ 8,
552
+ 10,
553
+ 7,
554
+ -1,
555
+ 2,
556
+ 7,
557
+ 2,
558
+ 8,
559
+ 1,
560
+ -1,
561
+ 2,
562
+ -1,
563
+ 7,
564
+ 10,
565
+ 10,
566
+ -1,
567
+ 3,
568
+ 10,
569
+ 2,
570
+ 3,
571
+ 9,
572
+ 3,
573
+ 9,
574
+ 10,
575
+ 8,
576
+ 7,
577
+ 9,
578
+ 8,
579
+ 8,
580
+ -1,
581
+ -1,
582
+ 1,
583
+ -1,
584
+ -1,
585
+ 7,
586
+ 2,
587
+ 8,
588
+ 10,
589
+ 10,
590
+ 7,
591
+ 9,
592
+ 5,
593
+ 8,
594
+ -1,
595
+ 1,
596
+ 2,
597
+ 7,
598
+ -1,
599
+ -1,
600
+ 8,
601
+ 10,
602
+ -1,
603
+ 2,
604
+ 9,
605
+ 0,
606
+ 8,
607
+ 3,
608
+ 2,
609
+ 9,
610
+ 10,
611
+ 10,
612
+ 3,
613
+ 10,
614
+ 5,
615
+ 7,
616
+ -1,
617
+ 1,
618
+ 4,
619
+ -1,
620
+ -1,
621
+ 8,
622
+ 9,
623
+ -1,
624
+ 8,
625
+ 7,
626
+ -1,
627
+ 8,
628
+ 8,
629
+ 5,
630
+ 10,
631
+ -1,
632
+ 2,
633
+ 3,
634
+ -1,
635
+ -1,
636
+ 2,
637
+ 4,
638
+ 5,
639
+ 1,
640
+ 6,
641
+ 8,
642
+ 4,
643
+ 3,
644
+ 3,
645
+ 3,
646
+ 3,
647
+ 4,
648
+ 8,
649
+ -1,
650
+ -1,
651
+ 4,
652
+ 2,
653
+ 2,
654
+ -1,
655
+ -1,
656
+ 4,
657
+ 3,
658
+ 2,
659
+ 8,
660
+ -1,
661
+ -1,
662
+ 9,
663
+ 10,
664
+ 8,
665
+ 5,
666
+ -1,
667
+ 2,
668
+ 5,
669
+ 8,
670
+ -1,
671
+ 1,
672
+ 6,
673
+ -1,
674
+ 10,
675
+ -1,
676
+ 10,
677
+ 10,
678
+ -1,
679
+ 5,
680
+ 8,
681
+ 3,
682
+ 0,
683
+ 10,
684
+ 5,
685
+ 8,
686
+ -1,
687
+ 3,
688
+ -1,
689
+ 7,
690
+ -1,
691
+ 10,
692
+ -1,
693
+ -1,
694
+ 7,
695
+ 9,
696
+ -1,
697
+ 8,
698
+ -1,
699
+ 10,
700
+ -1,
701
+ 7,
702
+ -1,
703
+ -1,
704
+ 2,
705
+ 6,
706
+ 2,
707
+ 1,
708
+ -1,
709
+ -1,
710
+ 7,
711
+ 0,
712
+ 8,
713
+ 1,
714
+ -1,
715
+ 0,
716
+ -1,
717
+ 5,
718
+ 7,
719
+ 0,
720
+ -1,
721
+ 6,
722
+ -1,
723
+ 10,
724
+ 10,
725
+ 0,
726
+ 10,
727
+ 7,
728
+ 7,
729
+ -1,
730
+ 2,
731
+ -1,
732
+ -1,
733
+ -1,
734
+ -1,
735
+ 6,
736
+ -1,
737
+ 10,
738
+ 0,
739
+ -1,
740
+ -1,
741
+ 3,
742
+ 3,
743
+ 3,
744
+ 6,
745
+ 10,
746
+ 3,
747
+ 7,
748
+ -1,
749
+ 3,
750
+ -1,
751
+ 7,
752
+ 7,
753
+ -1,
754
+ 0,
755
+ 7,
756
+ -1,
757
+ 10,
758
+ 1,
759
+ -1,
760
+ 0,
761
+ -1,
762
+ -1,
763
+ -1,
764
+ 5,
765
+ -1,
766
+ -1,
767
+ 8,
768
+ -1,
769
+ 3,
770
+ -1,
771
+ 8,
772
+ -1,
773
+ 10,
774
+ -1,
775
+ 7,
776
+ 3,
777
+ 7,
778
+ 8,
779
+ -1,
780
+ -1,
781
+ -1,
782
+ 7,
783
+ 8,
784
+ 7,
785
+ -1,
786
+ 6,
787
+ 5,
788
+ 8,
789
+ 3,
790
+ 4,
791
+ -1,
792
+ -1,
793
+ -1,
794
+ 8,
795
+ 7,
796
+ 8,
797
+ -1,
798
+ 9,
799
+ -1,
800
+ -1,
801
+ -1,
802
+ 2,
803
+ 7,
804
+ -1,
805
+ 5,
806
+ -1,
807
+ 8,
808
+ 8,
809
+ -1,
810
+ -1,
811
+ 3,
812
+ -1,
813
+ -1,
814
+ 8,
815
+ 3,
816
+ 2,
817
+ 5,
818
+ 3,
819
+ -1,
820
+ 9,
821
+ -1,
822
+ 8,
823
+ -1,
824
+ -1,
825
+ 10,
826
+ -1,
827
+ -1,
828
+ 9,
829
+ 4,
830
+ -1,
831
+ 3,
832
+ 10,
833
+ 3,
834
+ 6,
835
+ 10,
836
+ 7,
837
+ 3,
838
+ -1,
839
+ 3,
840
+ 4,
841
+ 10,
842
+ -1,
843
+ 0,
844
+ 3,
845
+ 3,
846
+ 10,
847
+ -1,
848
+ -1,
849
+ 7,
850
+ 0,
851
+ -1,
852
+ 10,
853
+ 10,
854
+ -1,
855
+ 7,
856
+ 8,
857
+ -1,
858
+ 7,
859
+ 3,
860
+ 4,
861
+ 2,
862
+ 1,
863
+ 4,
864
+ 7,
865
+ 3,
866
+ 0,
867
+ 4,
868
+ -1,
869
+ -1,
870
+ 7,
871
+ -1,
872
+ 1,
873
+ 10,
874
+ 7,
875
+ -1,
876
+ -1,
877
+ -1,
878
+ 2,
879
+ 0,
880
+ 0,
881
+ -1,
882
+ 3,
883
+ -1,
884
+ 1,
885
+ -1,
886
+ -1,
887
+ 3,
888
+ -1,
889
+ 4,
890
+ -1,
891
+ 0,
892
+ 3,
893
+ 0,
894
+ -1,
895
+ 8,
896
+ 10,
897
+ -1,
898
+ -1,
899
+ 1,
900
+ 4,
901
+ 7,
902
+ -1,
903
+ -1,
904
+ -1,
905
+ -1,
906
+ -1,
907
+ -1,
908
+ 0,
909
+ -1,
910
+ -1,
911
+ -1,
912
+ -1,
913
+ 4,
914
+ -1,
915
+ -1,
916
+ 8,
917
+ -1,
918
+ 7,
919
+ 2,
920
+ 3,
921
+ 7,
922
+ -1,
923
+ 3,
924
+ 5,
925
+ -1,
926
+ 0,
927
+ -1,
928
+ 3,
929
+ 2,
930
+ -1,
931
+ 6,
932
+ 8,
933
+ 3,
934
+ -1,
935
+ 10,
936
+ 3,
937
+ 10,
938
+ 0,
939
+ 6,
940
+ -1,
941
+ 2,
942
+ -1,
943
+ 0,
944
+ 0,
945
+ 7,
946
+ 4,
947
+ 6,
948
+ 2,
949
+ 5,
950
+ 2,
951
+ 10,
952
+ 3,
953
+ 6,
954
+ -1,
955
+ 1,
956
+ 0,
957
+ 8,
958
+ 5,
959
+ -1,
960
+ 1,
961
+ 0,
962
+ -1,
963
+ 1,
964
+ -1,
965
+ 10,
966
+ -1,
967
+ -1,
968
+ 5,
969
+ 3,
970
+ 2,
971
+ -1,
972
+ 10,
973
+ 1,
974
+ -1,
975
+ 3,
976
+ 7,
977
+ 2,
978
+ 7,
979
+ 3,
980
+ 4,
981
+ 8,
982
+ -1,
983
+ 1,
984
+ -1,
985
+ 10,
986
+ 9,
987
+ 3,
988
+ 1,
989
+ 4,
990
+ -1,
991
+ 8,
992
+ 7,
993
+ -1,
994
+ -1,
995
+ 8,
996
+ 2,
997
+ 10,
998
+ 7,
999
+ 2,
1000
+ 7,
1001
+ 7,
1002
+ 5,
1003
+ 3,
1004
+ -1,
1005
+ 3,
1006
+ 3,
1007
+ -1,
1008
+ 2,
1009
+ -1,
1010
+ 1,
1011
+ 10,
1012
+ 0,
1013
+ 10,
1014
+ 4,
1015
+ -1,
1016
+ -1,
1017
+ -1,
1018
+ -1,
1019
+ 4,
1020
+ 4,
1021
+ -1,
1022
+ 7,
1023
+ -1,
1024
+ -1
1025
+ ],
1026
+ "topic_sizes": {
1027
+ "8": 51,
1028
+ "2": 32,
1029
+ "1": 21,
1030
+ "-1": 169,
1031
+ "10": 51,
1032
+ "9": 16,
1033
+ "3": 49,
1034
+ "5": 19,
1035
+ "6": 15,
1036
+ "7": 47,
1037
+ "4": 22,
1038
+ "0": 24
1039
+ },
1040
+ "topic_mapper": [
1041
+ [
1042
+ -1,
1043
+ -1
1044
+ ],
1045
+ [
1046
+ 0,
1047
+ 0
1048
+ ],
1049
+ [
1050
+ 1,
1051
+ 1
1052
+ ],
1053
+ [
1054
+ 2,
1055
+ 2
1056
+ ],
1057
+ [
1058
+ 3,
1059
+ 3
1060
+ ],
1061
+ [
1062
+ 4,
1063
+ 4
1064
+ ],
1065
+ [
1066
+ 5,
1067
+ 5
1068
+ ],
1069
+ [
1070
+ 6,
1071
+ 6
1072
+ ],
1073
+ [
1074
+ 7,
1075
+ 7
1076
+ ],
1077
+ [
1078
+ 8,
1079
+ 8
1080
+ ],
1081
+ [
1082
+ 9,
1083
+ 9
1084
+ ],
1085
+ [
1086
+ 10,
1087
+ 10
1088
+ ]
1089
+ ],
1090
+ "topic_labels": {
1091
+ "-1": "-1_models_language_data_large",
1092
+ "0": "0_multimodal_visual_image_models",
1093
+ "1": "1_speech_asr_text_speaker",
1094
+ "2": "2_detection_models_text_language",
1095
+ "3": "3_code_language_llms_models",
1096
+ "4": "4_agents_policy_language_learning",
1097
+ "5": "5_reasoning_cot_problems_models",
1098
+ "6": "6_retrieval_information_query_llms",
1099
+ "7": "7_ai_models_language_dialogue",
1100
+ "8": "8_language_models_translation_model",
1101
+ "9": "9_distillation_model_knowledge_pretrained",
1102
+ "10": "10_training_model_models_transformer"
1103
+ },
1104
+ "custom_labels": null,
1105
+ "_outliers": 1,
1106
+ "topic_aspects": {
1107
+ "KeyBERT": {
1108
+ "-1": [
1109
+ [
1110
+ "large language models",
1111
+ 0.6703740358352661
1112
+ ],
1113
+ [
1114
+ "large language models llms",
1115
+ 0.6190639734268188
1116
+ ],
1117
+ [
1118
+ "language models",
1119
+ 0.6147422790527344
1120
+ ],
1121
+ [
1122
+ "language models llms",
1123
+ 0.567597508430481
1124
+ ],
1125
+ [
1126
+ "language model",
1127
+ 0.5490379333496094
1128
+ ],
1129
+ [
1130
+ "large language",
1131
+ 0.47846221923828125
1132
+ ],
1133
+ [
1134
+ "natural language",
1135
+ 0.47019103169441223
1136
+ ],
1137
+ [
1138
+ "semantic",
1139
+ 0.3743295669555664
1140
+ ],
1141
+ [
1142
+ "language",
1143
+ 0.36398619413375854
1144
+ ],
1145
+ [
1146
+ "training data",
1147
+ 0.36353152990341187
1148
+ ]
1149
+ ],
1150
+ "0": [
1151
+ [
1152
+ "multimodal large language",
1153
+ 0.6466671228408813
1154
+ ],
1155
+ [
1156
+ "multimodal models",
1157
+ 0.63934326171875
1158
+ ],
1159
+ [
1160
+ "multimodal",
1161
+ 0.6179039478302002
1162
+ ],
1163
+ [
1164
+ "multimodal large",
1165
+ 0.5376994609832764
1166
+ ],
1167
+ [
1168
+ "visual",
1169
+ 0.47933536767959595
1170
+ ],
1171
+ [
1172
+ "large language models",
1173
+ 0.4537416994571686
1174
+ ],
1175
+ [
1176
+ "visionlanguage",
1177
+ 0.4349161982536316
1178
+ ],
1179
+ [
1180
+ "language models",
1181
+ 0.42795825004577637
1182
+ ],
1183
+ [
1184
+ "large language model",
1185
+ 0.4277690649032593
1186
+ ],
1187
+ [
1188
+ "visual foundation models",
1189
+ 0.40677303075790405
1190
+ ]
1191
+ ],
1192
+ "1": [
1193
+ [
1194
+ "automatic speech",
1195
+ 0.6949269771575928
1196
+ ],
1197
+ [
1198
+ "automatic speech recognition asr",
1199
+ 0.6262308359146118
1200
+ ],
1201
+ [
1202
+ "speech recognition asr",
1203
+ 0.5822510123252869
1204
+ ],
1205
+ [
1206
+ "automatic speech recognition",
1207
+ 0.573049783706665
1208
+ ],
1209
+ [
1210
+ "speech recognition",
1211
+ 0.5546950697898865
1212
+ ],
1213
+ [
1214
+ "utterances",
1215
+ 0.5278962850570679
1216
+ ],
1217
+ [
1218
+ "large language models",
1219
+ 0.5129837989807129
1220
+ ],
1221
+ [
1222
+ "large language model",
1223
+ 0.4912102520465851
1224
+ ],
1225
+ [
1226
+ "language models",
1227
+ 0.47036200761795044
1228
+ ],
1229
+ [
1230
+ "speech",
1231
+ 0.44434642791748047
1232
+ ]
1233
+ ],
1234
+ "2": [
1235
+ [
1236
+ "large language models",
1237
+ 0.5753244161605835
1238
+ ],
1239
+ [
1240
+ "large language models llms",
1241
+ 0.5593785047531128
1242
+ ],
1243
+ [
1244
+ "language models",
1245
+ 0.5217305421829224
1246
+ ],
1247
+ [
1248
+ "language models llms",
1249
+ 0.5088766813278198
1250
+ ],
1251
+ [
1252
+ "machinegenerated text",
1253
+ 0.49884361028671265
1254
+ ],
1255
+ [
1256
+ "language model",
1257
+ 0.45426321029663086
1258
+ ],
1259
+ [
1260
+ "large language",
1261
+ 0.4042874574661255
1262
+ ],
1263
+ [
1264
+ "texts",
1265
+ 0.3673853874206543
1266
+ ],
1267
+ [
1268
+ "classifier",
1269
+ 0.354655921459198
1270
+ ],
1271
+ [
1272
+ "text",
1273
+ 0.3459568917751312
1274
+ ]
1275
+ ],
1276
+ "3": [
1277
+ [
1278
+ "code generation",
1279
+ 0.5884342193603516
1280
+ ],
1281
+ [
1282
+ "code completion",
1283
+ 0.5430148243904114
1284
+ ],
1285
+ [
1286
+ "source code",
1287
+ 0.5036313533782959
1288
+ ],
1289
+ [
1290
+ "large language models",
1291
+ 0.4955923557281494
1292
+ ],
1293
+ [
1294
+ "large language models llms",
1295
+ 0.48612886667251587
1296
+ ],
1297
+ [
1298
+ "language models",
1299
+ 0.44613736867904663
1300
+ ],
1301
+ [
1302
+ "software engineering",
1303
+ 0.44518738985061646
1304
+ ],
1305
+ [
1306
+ "language models llms",
1307
+ 0.44061604142189026
1308
+ ],
1309
+ [
1310
+ "programming",
1311
+ 0.41835474967956543
1312
+ ],
1313
+ [
1314
+ "coding",
1315
+ 0.4044495224952698
1316
+ ]
1317
+ ],
1318
+ "4": [
1319
+ [
1320
+ "large language models llms",
1321
+ 0.4626759886741638
1322
+ ],
1323
+ [
1324
+ "ai",
1325
+ 0.4613281488418579
1326
+ ],
1327
+ [
1328
+ "language models llms",
1329
+ 0.45701661705970764
1330
+ ],
1331
+ [
1332
+ "agent",
1333
+ 0.4489193260669708
1334
+ ],
1335
+ [
1336
+ "large language models",
1337
+ 0.4476342499256134
1338
+ ],
1339
+ [
1340
+ "agents",
1341
+ 0.44667837023735046
1342
+ ],
1343
+ [
1344
+ "interactive",
1345
+ 0.439677357673645
1346
+ ],
1347
+ [
1348
+ "language models",
1349
+ 0.4368625581264496
1350
+ ],
1351
+ [
1352
+ "reinforcement",
1353
+ 0.4350704252719879
1354
+ ],
1355
+ [
1356
+ "language model",
1357
+ 0.42887791991233826
1358
+ ]
1359
+ ],
1360
+ "5": [
1361
+ [
1362
+ "reasoning large language models",
1363
+ 0.6903330087661743
1364
+ ],
1365
+ [
1366
+ "reasoning tasks",
1367
+ 0.6320526599884033
1368
+ ],
1369
+ [
1370
+ "reasoning large language",
1371
+ 0.630852460861206
1372
+ ],
1373
+ [
1374
+ "reasoning capabilities",
1375
+ 0.6158041954040527
1376
+ ],
1377
+ [
1378
+ "reasoning benchmarks",
1379
+ 0.5364078283309937
1380
+ ],
1381
+ [
1382
+ "large language models",
1383
+ 0.48382118344306946
1384
+ ],
1385
+ [
1386
+ "large language models llms",
1387
+ 0.4739668369293213
1388
+ ],
1389
+ [
1390
+ "complex reasoning",
1391
+ 0.46622762084007263
1392
+ ],
1393
+ [
1394
+ "language models",
1395
+ 0.4620729982852936
1396
+ ],
1397
+ [
1398
+ "language models llms",
1399
+ 0.45314210653305054
1400
+ ]
1401
+ ],
1402
+ "6": [
1403
+ [
1404
+ "large language models llm",
1405
+ 0.6180689334869385
1406
+ ],
1407
+ [
1408
+ "large language models llms",
1409
+ 0.6018953323364258
1410
+ ],
1411
+ [
1412
+ "large language models",
1413
+ 0.5865136384963989
1414
+ ],
1415
+ [
1416
+ "language models llm",
1417
+ 0.5565091371536255
1418
+ ],
1419
+ [
1420
+ "language models llms",
1421
+ 0.5427589416503906
1422
+ ],
1423
+ [
1424
+ "language models",
1425
+ 0.505111813545227
1426
+ ],
1427
+ [
1428
+ "information retrieval",
1429
+ 0.5001325011253357
1430
+ ],
1431
+ [
1432
+ "retrieval",
1433
+ 0.46649330854415894
1434
+ ],
1435
+ [
1436
+ "knowledge bases",
1437
+ 0.4627561867237091
1438
+ ],
1439
+ [
1440
+ "large language",
1441
+ 0.3926961421966553
1442
+ ]
1443
+ ],
1444
+ "7": [
1445
+ [
1446
+ "conversational ai",
1447
+ 0.6492804884910583
1448
+ ],
1449
+ [
1450
+ "chatbots",
1451
+ 0.5619252324104309
1452
+ ],
1453
+ [
1454
+ "large language models",
1455
+ 0.5536242723464966
1456
+ ],
1457
+ [
1458
+ "large language models llms",
1459
+ 0.5412259101867676
1460
+ ],
1461
+ [
1462
+ "language models llms",
1463
+ 0.5045098066329956
1464
+ ],
1465
+ [
1466
+ "language models",
1467
+ 0.4986751079559326
1468
+ ],
1469
+ [
1470
+ "generative ai",
1471
+ 0.4693562090396881
1472
+ ],
1473
+ [
1474
+ "dialogues",
1475
+ 0.4594458043575287
1476
+ ],
1477
+ [
1478
+ "chatbot",
1479
+ 0.4492765963077545
1480
+ ],
1481
+ [
1482
+ "language model",
1483
+ 0.4488487243652344
1484
+ ]
1485
+ ],
1486
+ "8": [
1487
+ [
1488
+ "neural machine translation",
1489
+ 0.647374153137207
1490
+ ],
1491
+ [
1492
+ "machine translation",
1493
+ 0.622808575630188
1494
+ ],
1495
+ [
1496
+ "large language models",
1497
+ 0.5983676314353943
1498
+ ],
1499
+ [
1500
+ "language models",
1501
+ 0.48987895250320435
1502
+ ],
1503
+ [
1504
+ "translations",
1505
+ 0.4664888381958008
1506
+ ],
1507
+ [
1508
+ "large language",
1509
+ 0.44685059785842896
1510
+ ],
1511
+ [
1512
+ "language model",
1513
+ 0.4379696249961853
1514
+ ],
1515
+ [
1516
+ "largescale language",
1517
+ 0.43625307083129883
1518
+ ],
1519
+ [
1520
+ "text generation",
1521
+ 0.4185757339000702
1522
+ ],
1523
+ [
1524
+ "multilingual",
1525
+ 0.40777745842933655
1526
+ ]
1527
+ ],
1528
+ "9": [
1529
+ [
1530
+ "pretrained language models",
1531
+ 0.6431570053100586
1532
+ ],
1533
+ [
1534
+ "pretrained language",
1535
+ 0.5234094858169556
1536
+ ],
1537
+ [
1538
+ "knowledge distillation",
1539
+ 0.4824550151824951
1540
+ ],
1541
+ [
1542
+ "model pretraining",
1543
+ 0.47364121675491333
1544
+ ],
1545
+ [
1546
+ "nlp tasks",
1547
+ 0.4624066948890686
1548
+ ],
1549
+ [
1550
+ "language models",
1551
+ 0.45500046014785767
1552
+ ],
1553
+ [
1554
+ "language model",
1555
+ 0.42565417289733887
1556
+ ],
1557
+ [
1558
+ "language understanding generation",
1559
+ 0.4114922285079956
1560
+ ],
1561
+ [
1562
+ "transfer learning",
1563
+ 0.39377400279045105
1564
+ ],
1565
+ [
1566
+ "pretraining",
1567
+ 0.3853399157524109
1568
+ ]
1569
+ ],
1570
+ "10": [
1571
+ [
1572
+ "large language models",
1573
+ 0.6176133155822754
1574
+ ],
1575
+ [
1576
+ "large language",
1577
+ 0.47964465618133545
1578
+ ],
1579
+ [
1580
+ "language models",
1581
+ 0.45619314908981323
1582
+ ],
1583
+ [
1584
+ "memory",
1585
+ 0.4200182557106018
1586
+ ],
1587
+ [
1588
+ "sparse",
1589
+ 0.41314226388931274
1590
+ ],
1591
+ [
1592
+ "attention",
1593
+ 0.3642992377281189
1594
+ ],
1595
+ [
1596
+ "learning",
1597
+ 0.34689220786094666
1598
+ ],
1599
+ [
1600
+ "compression",
1601
+ 0.33176761865615845
1602
+ ],
1603
+ [
1604
+ "efficiently",
1605
+ 0.3242114186286926
1606
+ ],
1607
+ [
1608
+ "neural",
1609
+ 0.3096249997615814
1610
+ ]
1611
+ ]
1612
+ },
1613
+ "MMR": {
1614
+ "-1": [
1615
+ [
1616
+ "models",
1617
+ 0.03389659414835857
1618
+ ],
1619
+ [
1620
+ "language",
1621
+ 0.02862851018591238
1622
+ ],
1623
+ [
1624
+ "data",
1625
+ 0.025761470494060225
1626
+ ],
1627
+ [
1628
+ "large",
1629
+ 0.022745338330438167
1630
+ ],
1631
+ [
1632
+ "language models",
1633
+ 0.02139331315319328
1634
+ ],
1635
+ [
1636
+ "tasks",
1637
+ 0.019900039131413454
1638
+ ],
1639
+ [
1640
+ "model",
1641
+ 0.019776504965187965
1642
+ ],
1643
+ [
1644
+ "llms",
1645
+ 0.01898021896221333
1646
+ ],
1647
+ [
1648
+ "large language",
1649
+ 0.018796014150363968
1650
+ ],
1651
+ [
1652
+ "large language models",
1653
+ 0.01803230842296634
1654
+ ]
1655
+ ],
1656
+ "0": [
1657
+ [
1658
+ "multimodal",
1659
+ 0.06136737485351937
1660
+ ],
1661
+ [
1662
+ "visual",
1663
+ 0.05853719763492334
1664
+ ],
1665
+ [
1666
+ "image",
1667
+ 0.0485116842448941
1668
+ ],
1669
+ [
1670
+ "models",
1671
+ 0.04071455273538424
1672
+ ],
1673
+ [
1674
+ "generation",
1675
+ 0.03666106634697344
1676
+ ],
1677
+ [
1678
+ "video",
1679
+ 0.03347983737726034
1680
+ ],
1681
+ [
1682
+ "understanding",
1683
+ 0.030394954956701734
1684
+ ],
1685
+ [
1686
+ "large",
1687
+ 0.02799241301244745
1688
+ ],
1689
+ [
1690
+ "instruction",
1691
+ 0.026460918679033146
1692
+ ],
1693
+ [
1694
+ "model",
1695
+ 0.025928671364185387
1696
+ ]
1697
+ ],
1698
+ "1": [
1699
+ [
1700
+ "speech",
1701
+ 0.11570004480898892
1702
+ ],
1703
+ [
1704
+ "asr",
1705
+ 0.07577068396313229
1706
+ ],
1707
+ [
1708
+ "text",
1709
+ 0.045675035457062994
1710
+ ],
1711
+ [
1712
+ "speaker",
1713
+ 0.04413188327842426
1714
+ ],
1715
+ [
1716
+ "recognition",
1717
+ 0.0425835487386093
1718
+ ],
1719
+ [
1720
+ "speech recognition",
1721
+ 0.03380651953632135
1722
+ ],
1723
+ [
1724
+ "model",
1725
+ 0.030672757538167814
1726
+ ],
1727
+ [
1728
+ "voice",
1729
+ 0.030005193176805063
1730
+ ],
1731
+ [
1732
+ "language",
1733
+ 0.028788932413434717
1734
+ ],
1735
+ [
1736
+ "proposed",
1737
+ 0.028393525385724573
1738
+ ]
1739
+ ],
1740
+ "2": [
1741
+ [
1742
+ "detection",
1743
+ 0.04428034234306033
1744
+ ],
1745
+ [
1746
+ "models",
1747
+ 0.034572645259843875
1748
+ ],
1749
+ [
1750
+ "text",
1751
+ 0.034327288774788946
1752
+ ],
1753
+ [
1754
+ "language",
1755
+ 0.032811332249369544
1756
+ ],
1757
+ [
1758
+ "model",
1759
+ 0.027098162035851945
1760
+ ],
1761
+ [
1762
+ "large",
1763
+ 0.02498863472584851
1764
+ ],
1765
+ [
1766
+ "language models",
1767
+ 0.02455132759078857
1768
+ ],
1769
+ [
1770
+ "misinformation",
1771
+ 0.021777813891068417
1772
+ ],
1773
+ [
1774
+ "dataset",
1775
+ 0.020171781105023295
1776
+ ],
1777
+ [
1778
+ "large language",
1779
+ 0.019042601273926114
1780
+ ]
1781
+ ],
1782
+ "3": [
1783
+ [
1784
+ "code",
1785
+ 0.0770374212782715
1786
+ ],
1787
+ [
1788
+ "language",
1789
+ 0.03246858995960753
1790
+ ],
1791
+ [
1792
+ "llms",
1793
+ 0.0314836868548782
1794
+ ],
1795
+ [
1796
+ "models",
1797
+ 0.03135048850848837
1798
+ ],
1799
+ [
1800
+ "programming",
1801
+ 0.031032804646709017
1802
+ ],
1803
+ [
1804
+ "software",
1805
+ 0.02338912467302265
1806
+ ],
1807
+ [
1808
+ "language models",
1809
+ 0.021984864229269385
1810
+ ],
1811
+ [
1812
+ "tasks",
1813
+ 0.020079101943328282
1814
+ ],
1815
+ [
1816
+ "model",
1817
+ 0.019589308762890383
1818
+ ],
1819
+ [
1820
+ "large language",
1821
+ 0.01903123933999179
1822
+ ]
1823
+ ],
1824
+ "4": [
1825
+ [
1826
+ "agents",
1827
+ 0.03144148115888399
1828
+ ],
1829
+ [
1830
+ "policy",
1831
+ 0.030578201460243676
1832
+ ],
1833
+ [
1834
+ "language",
1835
+ 0.02958531504638217
1836
+ ],
1837
+ [
1838
+ "learning",
1839
+ 0.028956274103493298
1840
+ ],
1841
+ [
1842
+ "tasks",
1843
+ 0.02750394011875191
1844
+ ],
1845
+ [
1846
+ "llms",
1847
+ 0.02635994576153496
1848
+ ],
1849
+ [
1850
+ "agent",
1851
+ 0.025182048261377144
1852
+ ],
1853
+ [
1854
+ "games",
1855
+ 0.024542318750706546
1856
+ ],
1857
+ [
1858
+ "knowledge",
1859
+ 0.02370911390325935
1860
+ ],
1861
+ [
1862
+ "model",
1863
+ 0.022937914034634292
1864
+ ]
1865
+ ],
1866
+ "5": [
1867
+ [
1868
+ "reasoning",
1869
+ 0.0929222616473664
1870
+ ],
1871
+ [
1872
+ "cot",
1873
+ 0.040278262754808086
1874
+ ],
1875
+ [
1876
+ "problems",
1877
+ 0.03708374749407663
1878
+ ],
1879
+ [
1880
+ "models",
1881
+ 0.037066520645951874
1882
+ ],
1883
+ [
1884
+ "commonsense",
1885
+ 0.03179862849063796
1886
+ ],
1887
+ [
1888
+ "prompting",
1889
+ 0.029329865166081277
1890
+ ],
1891
+ [
1892
+ "language",
1893
+ 0.028264168425802164
1894
+ ],
1895
+ [
1896
+ "language models",
1897
+ 0.0264493153712725
1898
+ ],
1899
+ [
1900
+ "math",
1901
+ 0.025706499789005296
1902
+ ],
1903
+ [
1904
+ "chainofthought",
1905
+ 0.025706499789005296
1906
+ ]
1907
+ ],
1908
+ "6": [
1909
+ [
1910
+ "retrieval",
1911
+ 0.05202021714558848
1912
+ ],
1913
+ [
1914
+ "information",
1915
+ 0.03944643968452574
1916
+ ],
1917
+ [
1918
+ "query",
1919
+ 0.03862936737060072
1920
+ ],
1921
+ [
1922
+ "llms",
1923
+ 0.03381477714650923
1924
+ ],
1925
+ [
1926
+ "queries",
1927
+ 0.030743284058903933
1928
+ ],
1929
+ [
1930
+ "models",
1931
+ 0.029746047746957945
1932
+ ],
1933
+ [
1934
+ "language",
1935
+ 0.029551563260931873
1936
+ ],
1937
+ [
1938
+ "language models",
1939
+ 0.026223664397697355
1940
+ ],
1941
+ [
1942
+ "large",
1943
+ 0.02485613931372842
1944
+ ],
1945
+ [
1946
+ "information retrieval",
1947
+ 0.023226595334291323
1948
+ ]
1949
+ ],
1950
+ "7": [
1951
+ [
1952
+ "ai",
1953
+ 0.035714494707298816
1954
+ ],
1955
+ [
1956
+ "models",
1957
+ 0.02952586052201538
1958
+ ],
1959
+ [
1960
+ "language",
1961
+ 0.02834496901204038
1962
+ ],
1963
+ [
1964
+ "dialogue",
1965
+ 0.027651780633986222
1966
+ ],
1967
+ [
1968
+ "human",
1969
+ 0.026401782297458473
1970
+ ],
1971
+ [
1972
+ "llms",
1973
+ 0.025442822490930102
1974
+ ],
1975
+ [
1976
+ "chatgpt",
1977
+ 0.02318990727379372
1978
+ ],
1979
+ [
1980
+ "large language",
1981
+ 0.02271947226012763
1982
+ ],
1983
+ [
1984
+ "large",
1985
+ 0.021957413569739143
1986
+ ],
1987
+ [
1988
+ "model",
1989
+ 0.02070839524294738
1990
+ ]
1991
+ ],
1992
+ "8": [
1993
+ [
1994
+ "language",
1995
+ 0.04093864302301298
1996
+ ],
1997
+ [
1998
+ "models",
1999
+ 0.03595221175799092
2000
+ ],
2001
+ [
2002
+ "translation",
2003
+ 0.031712613088874894
2004
+ ],
2005
+ [
2006
+ "model",
2007
+ 0.030177929651754233
2008
+ ],
2009
+ [
2010
+ "language models",
2011
+ 0.026247976024177194
2012
+ ],
2013
+ [
2014
+ "text",
2015
+ 0.024834259305576166
2016
+ ],
2017
+ [
2018
+ "data",
2019
+ 0.02462670002569503
2020
+ ],
2021
+ [
2022
+ "generation",
2023
+ 0.020743602919543393
2024
+ ],
2025
+ [
2026
+ "tasks",
2027
+ 0.020568403779006268
2028
+ ],
2029
+ [
2030
+ "machine translation",
2031
+ 0.019130933056569405
2032
+ ]
2033
+ ],
2034
+ "9": [
2035
+ [
2036
+ "distillation",
2037
+ 0.04337789490301995
2038
+ ],
2039
+ [
2040
+ "model",
2041
+ 0.040261980975691315
2042
+ ],
2043
+ [
2044
+ "knowledge",
2045
+ 0.03986242788324582
2046
+ ],
2047
+ [
2048
+ "pretrained",
2049
+ 0.039810767531247584
2050
+ ],
2051
+ [
2052
+ "student",
2053
+ 0.03578735650250997
2054
+ ],
2055
+ [
2056
+ "models",
2057
+ 0.03577800012735637
2058
+ ],
2059
+ [
2060
+ "teacher",
2061
+ 0.034995506692116485
2062
+ ],
2063
+ [
2064
+ "30",
2065
+ 0.03383519763051433
2066
+ ],
2067
+ [
2068
+ "pretraining",
2069
+ 0.030341356455441396
2070
+ ],
2071
+ [
2072
+ "language",
2073
+ 0.029777618334555132
2074
+ ]
2075
+ ],
2076
+ "10": [
2077
+ [
2078
+ "training",
2079
+ 0.039846773292934345
2080
+ ],
2081
+ [
2082
+ "model",
2083
+ 0.03354112714562384
2084
+ ],
2085
+ [
2086
+ "models",
2087
+ 0.03309176444172136
2088
+ ],
2089
+ [
2090
+ "transformer",
2091
+ 0.02791942196230748
2092
+ ],
2093
+ [
2094
+ "language",
2095
+ 0.024257026345120718
2096
+ ],
2097
+ [
2098
+ "finetuning",
2099
+ 0.022555042408780118
2100
+ ],
2101
+ [
2102
+ "large",
2103
+ 0.02231168487342353
2104
+ ],
2105
+ [
2106
+ "quantization",
2107
+ 0.021953720153927197
2108
+ ],
2109
+ [
2110
+ "transformers",
2111
+ 0.02143379388468265
2112
+ ],
2113
+ [
2114
+ "tasks",
2115
+ 0.020718276629461102
2116
+ ]
2117
+ ]
2118
+ },
2119
+ "POS": {
2120
+ "-1": [
2121
+ [
2122
+ "models",
2123
+ 0.03389659414835857
2124
+ ],
2125
+ [
2126
+ "language",
2127
+ 0.02862851018591238
2128
+ ],
2129
+ [
2130
+ "data",
2131
+ 0.025761470494060225
2132
+ ],
2133
+ [
2134
+ "large",
2135
+ 0.022745338330438167
2136
+ ],
2137
+ [
2138
+ "tasks",
2139
+ 0.019900039131413454
2140
+ ],
2141
+ [
2142
+ "model",
2143
+ 0.019776504965187965
2144
+ ],
2145
+ [
2146
+ "large language",
2147
+ 0.018796014150363968
2148
+ ],
2149
+ [
2150
+ "learning",
2151
+ 0.016344890778099884
2152
+ ],
2153
+ [
2154
+ "knowledge",
2155
+ 0.014791777335488
2156
+ ],
2157
+ [
2158
+ "performance",
2159
+ 0.014448725147256262
2160
+ ]
2161
+ ],
2162
+ "0": [
2163
+ [
2164
+ "multimodal",
2165
+ 0.06136737485351937
2166
+ ],
2167
+ [
2168
+ "visual",
2169
+ 0.05853719763492334
2170
+ ],
2171
+ [
2172
+ "image",
2173
+ 0.0485116842448941
2174
+ ],
2175
+ [
2176
+ "models",
2177
+ 0.04071455273538424
2178
+ ],
2179
+ [
2180
+ "generation",
2181
+ 0.03666106634697344
2182
+ ],
2183
+ [
2184
+ "video",
2185
+ 0.03347983737726034
2186
+ ],
2187
+ [
2188
+ "understanding",
2189
+ 0.030394954956701734
2190
+ ],
2191
+ [
2192
+ "large",
2193
+ 0.02799241301244745
2194
+ ],
2195
+ [
2196
+ "instruction",
2197
+ 0.026460918679033146
2198
+ ],
2199
+ [
2200
+ "model",
2201
+ 0.025928671364185387
2202
+ ]
2203
+ ],
2204
+ "1": [
2205
+ [
2206
+ "speech",
2207
+ 0.11570004480898892
2208
+ ],
2209
+ [
2210
+ "text",
2211
+ 0.045675035457062994
2212
+ ],
2213
+ [
2214
+ "speaker",
2215
+ 0.04413188327842426
2216
+ ],
2217
+ [
2218
+ "recognition",
2219
+ 0.0425835487386093
2220
+ ],
2221
+ [
2222
+ "model",
2223
+ 0.030672757538167814
2224
+ ],
2225
+ [
2226
+ "voice",
2227
+ 0.030005193176805063
2228
+ ],
2229
+ [
2230
+ "language",
2231
+ 0.028788932413434717
2232
+ ],
2233
+ [
2234
+ "systems",
2235
+ 0.02748631604741655
2236
+ ],
2237
+ [
2238
+ "error",
2239
+ 0.02657560180020219
2240
+ ],
2241
+ [
2242
+ "prompt",
2243
+ 0.026226831648547774
2244
+ ]
2245
+ ],
2246
+ "2": [
2247
+ [
2248
+ "detection",
2249
+ 0.04428034234306033
2250
+ ],
2251
+ [
2252
+ "models",
2253
+ 0.034572645259843875
2254
+ ],
2255
+ [
2256
+ "text",
2257
+ 0.034327288774788946
2258
+ ],
2259
+ [
2260
+ "language",
2261
+ 0.032811332249369544
2262
+ ],
2263
+ [
2264
+ "model",
2265
+ 0.027098162035851945
2266
+ ],
2267
+ [
2268
+ "large",
2269
+ 0.02498863472584851
2270
+ ],
2271
+ [
2272
+ "misinformation",
2273
+ 0.021777813891068417
2274
+ ],
2275
+ [
2276
+ "dataset",
2277
+ 0.020171781105023295
2278
+ ],
2279
+ [
2280
+ "large language",
2281
+ 0.019042601273926114
2282
+ ],
2283
+ [
2284
+ "bias",
2285
+ 0.018565158646766316
2286
+ ]
2287
+ ],
2288
+ "3": [
2289
+ [
2290
+ "code",
2291
+ 0.0770374212782715
2292
+ ],
2293
+ [
2294
+ "language",
2295
+ 0.03246858995960753
2296
+ ],
2297
+ [
2298
+ "models",
2299
+ 0.03135048850848837
2300
+ ],
2301
+ [
2302
+ "programming",
2303
+ 0.031032804646709017
2304
+ ],
2305
+ [
2306
+ "software",
2307
+ 0.02338912467302265
2308
+ ],
2309
+ [
2310
+ "tasks",
2311
+ 0.020079101943328282
2312
+ ],
2313
+ [
2314
+ "model",
2315
+ 0.019589308762890383
2316
+ ],
2317
+ [
2318
+ "large language",
2319
+ 0.01903123933999179
2320
+ ],
2321
+ [
2322
+ "large",
2323
+ 0.018419645004564857
2324
+ ],
2325
+ [
2326
+ "program",
2327
+ 0.01732377045192171
2328
+ ]
2329
+ ],
2330
+ "4": [
2331
+ [
2332
+ "agents",
2333
+ 0.03144148115888399
2334
+ ],
2335
+ [
2336
+ "policy",
2337
+ 0.030578201460243676
2338
+ ],
2339
+ [
2340
+ "language",
2341
+ 0.02958531504638217
2342
+ ],
2343
+ [
2344
+ "learning",
2345
+ 0.028956274103493298
2346
+ ],
2347
+ [
2348
+ "tasks",
2349
+ 0.02750394011875191
2350
+ ],
2351
+ [
2352
+ "agent",
2353
+ 0.025182048261377144
2354
+ ],
2355
+ [
2356
+ "games",
2357
+ 0.024542318750706546
2358
+ ],
2359
+ [
2360
+ "knowledge",
2361
+ 0.02370911390325935
2362
+ ],
2363
+ [
2364
+ "model",
2365
+ 0.022937914034634292
2366
+ ],
2367
+ [
2368
+ "models",
2369
+ 0.021670826257073117
2370
+ ]
2371
+ ],
2372
+ "5": [
2373
+ [
2374
+ "reasoning",
2375
+ 0.0929222616473664
2376
+ ],
2377
+ [
2378
+ "problems",
2379
+ 0.03708374749407663
2380
+ ],
2381
+ [
2382
+ "models",
2383
+ 0.037066520645951874
2384
+ ],
2385
+ [
2386
+ "commonsense",
2387
+ 0.03179862849063796
2388
+ ],
2389
+ [
2390
+ "prompting",
2391
+ 0.029329865166081277
2392
+ ],
2393
+ [
2394
+ "language",
2395
+ 0.028264168425802164
2396
+ ],
2397
+ [
2398
+ "math",
2399
+ 0.025706499789005296
2400
+ ],
2401
+ [
2402
+ "performance",
2403
+ 0.023715301369860727
2404
+ ],
2405
+ [
2406
+ "model",
2407
+ 0.02348865107952526
2408
+ ],
2409
+ [
2410
+ "large",
2411
+ 0.0226412358105249
2412
+ ]
2413
+ ],
2414
+ "6": [
2415
+ [
2416
+ "retrieval",
2417
+ 0.05202021714558848
2418
+ ],
2419
+ [
2420
+ "information",
2421
+ 0.03944643968452574
2422
+ ],
2423
+ [
2424
+ "query",
2425
+ 0.03862936737060072
2426
+ ],
2427
+ [
2428
+ "queries",
2429
+ 0.030743284058903933
2430
+ ],
2431
+ [
2432
+ "models",
2433
+ 0.029746047746957945
2434
+ ],
2435
+ [
2436
+ "language",
2437
+ 0.029551563260931873
2438
+ ],
2439
+ [
2440
+ "large",
2441
+ 0.02485613931372842
2442
+ ],
2443
+ [
2444
+ "augmentation",
2445
+ 0.02171476619738611
2446
+ ],
2447
+ [
2448
+ "results",
2449
+ 0.020391690505114853
2450
+ ],
2451
+ [
2452
+ "generative",
2453
+ 0.019244542166013356
2454
+ ]
2455
+ ],
2456
+ "7": [
2457
+ [
2458
+ "models",
2459
+ 0.02952586052201538
2460
+ ],
2461
+ [
2462
+ "language",
2463
+ 0.02834496901204038
2464
+ ],
2465
+ [
2466
+ "dialogue",
2467
+ 0.027651780633986222
2468
+ ],
2469
+ [
2470
+ "human",
2471
+ 0.026401782297458473
2472
+ ],
2473
+ [
2474
+ "large language",
2475
+ 0.02271947226012763
2476
+ ],
2477
+ [
2478
+ "large",
2479
+ 0.021957413569739143
2480
+ ],
2481
+ [
2482
+ "model",
2483
+ 0.02070839524294738
2484
+ ],
2485
+ [
2486
+ "chatbots",
2487
+ 0.0204145670075834
2488
+ ],
2489
+ [
2490
+ "responses",
2491
+ 0.019623949467271785
2492
+ ],
2493
+ [
2494
+ "agents",
2495
+ 0.018653284453243282
2496
+ ]
2497
+ ],
2498
+ "8": [
2499
+ [
2500
+ "language",
2501
+ 0.04093864302301298
2502
+ ],
2503
+ [
2504
+ "models",
2505
+ 0.03595221175799092
2506
+ ],
2507
+ [
2508
+ "translation",
2509
+ 0.031712613088874894
2510
+ ],
2511
+ [
2512
+ "model",
2513
+ 0.030177929651754233
2514
+ ],
2515
+ [
2516
+ "text",
2517
+ 0.024834259305576166
2518
+ ],
2519
+ [
2520
+ "data",
2521
+ 0.02462670002569503
2522
+ ],
2523
+ [
2524
+ "generation",
2525
+ 0.020743602919543393
2526
+ ],
2527
+ [
2528
+ "tasks",
2529
+ 0.020568403779006268
2530
+ ],
2531
+ [
2532
+ "machine",
2533
+ 0.01848825539347313
2534
+ ],
2535
+ [
2536
+ "large",
2537
+ 0.018176145065047958
2538
+ ]
2539
+ ],
2540
+ "9": [
2541
+ [
2542
+ "distillation",
2543
+ 0.04337789490301995
2544
+ ],
2545
+ [
2546
+ "model",
2547
+ 0.040261980975691315
2548
+ ],
2549
+ [
2550
+ "knowledge",
2551
+ 0.03986242788324582
2552
+ ],
2553
+ [
2554
+ "student",
2555
+ 0.03578735650250997
2556
+ ],
2557
+ [
2558
+ "models",
2559
+ 0.03577800012735637
2560
+ ],
2561
+ [
2562
+ "teacher",
2563
+ 0.034995506692116485
2564
+ ],
2565
+ [
2566
+ "language",
2567
+ 0.029777618334555132
2568
+ ],
2569
+ [
2570
+ "tasks",
2571
+ 0.027081961204377804
2572
+ ],
2573
+ [
2574
+ "performance",
2575
+ 0.026439338569396797
2576
+ ],
2577
+ [
2578
+ "answer",
2579
+ 0.023503384095700217
2580
+ ]
2581
+ ],
2582
+ "10": [
2583
+ [
2584
+ "training",
2585
+ 0.039846773292934345
2586
+ ],
2587
+ [
2588
+ "model",
2589
+ 0.03354112714562384
2590
+ ],
2591
+ [
2592
+ "models",
2593
+ 0.03309176444172136
2594
+ ],
2595
+ [
2596
+ "transformer",
2597
+ 0.02791942196230748
2598
+ ],
2599
+ [
2600
+ "language",
2601
+ 0.024257026345120718
2602
+ ],
2603
+ [
2604
+ "finetuning",
2605
+ 0.022555042408780118
2606
+ ],
2607
+ [
2608
+ "large",
2609
+ 0.02231168487342353
2610
+ ],
2611
+ [
2612
+ "quantization",
2613
+ 0.021953720153927197
2614
+ ],
2615
+ [
2616
+ "transformers",
2617
+ 0.02143379388468265
2618
+ ],
2619
+ [
2620
+ "tasks",
2621
+ 0.020718276629461102
2622
+ ]
2623
+ ]
2624
+ }
2625
+ }
2626
+ }