Rchamba committed
Commit 67fa2d2 · verified · 1 Parent(s): f6f5abd

Add BERTopic model

README.md ADDED
@@ -0,0 +1,83 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ pipeline_tag: text-classification
+ ---
+
+ # bal_arxiv_scientific_paps_berttopic_model
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("Rchamba/bal_arxiv_scientific_paps_berttopic_model")
+
+ topic_model.get_topic_info()
+ ```
+
+ ## Topic overview
+
+ * Number of topics: 14
+ * Number of training documents: 360
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | -1 | data - steganography - secret - probability - method | 13 | -1_data_steganography_secret_probability |
+ | 0 | sp - intelligence - processing - human - image | 35 | 0_sp_intelligence_processing_human |
+ | 1 | quantum - automata - finite - classical - measurement | 64 | 1_quantum_automata_finite_classical |
+ | 2 | problems - complexity - constraints - symmetry - csps | 37 | 2_problems_complexity_constraints_symmetry |
+ | 3 | logic - computability - cl - edu - www | 25 | 3_logic_computability_cl_edu |
+ | 4 | science - citation - journals - social - communication | 24 | 4_science_citation_journals_social |
+ | 5 | tetraquark - vector - bar - rm - qcd | 23 | 5_tetraquark_vector_bar_rm |
+ | 6 | combinatorial - problems - design - problem - clustering | 22 | 6_combinatorial_problems_design_problem |
+ | 7 | prediction - entropy - model - universal - cc | 22 | 7_prediction_entropy_model_universal |
+ | 8 | notes - informal - spaces - analysis - metric | 21 | 8_notes_informal_spaces_analysis |
+ | 9 | orbital - earth - postnewtonian - effects - artificial | 21 | 9_orbital_earth_postnewtonian_effects |
+ | 10 | keyphrases - word - algorithm - semantic - similarity | 20 | 10_keyphrases_word_algorithm_semantic |
+ | 11 | kernel - gmm - kernels - datasets - classification | 17 | 11_kernel_gmm_kernels_datasets |
+ | 12 | data - ultrametric - ultrametricity - analysis - application | 16 | 12_data_ultrametric_ultrametricity_analysis |
+
+ </details>
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: True
+ * language: english
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: None
+ * seed_topic_list: None
+ * top_n_words: 10
+ * verbose: False
+ * zeroshot_min_similarity: 0.7
+ * zeroshot_topic_list: None
+
+ ## Framework versions
+
+ * Numpy: 2.0.2
+ * HDBSCAN: 0.8.40
+ * UMAP: 0.5.7
+ * Pandas: 2.2.2
+ * Scikit-Learn: 1.6.1
+ * Sentence-transformers: 4.1.0
+ * Transformers: 4.52.4
+ * Numba: 0.60.0
+ * Plotly: 5.24.1
+ * Python: 3.11.13
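
Review note: in the README's topic table, each Label appears to be the topic ID joined by underscores with the first four keywords, while the Topic Keywords column shows the first five joined by " - ". The sketch below reproduces that pattern for one row. It is inferred from the rows in this diff, not taken from BERTopic's source, and the helper names `keywords_column` and `default_label` are hypothetical.

```python
def keywords_column(words, n=5):
    """Join the first n keywords the way the 'Topic Keywords' column shows them."""
    return " - ".join(words[:n])


def default_label(topic_id, words, n=4):
    """Build a label of the form '<topic id>_<w1>_<w2>_<w3>_<w4>'."""
    return "_".join([str(topic_id), *words[:n]])


# Keywords for topic 5, copied from the README table in this commit.
topic_5 = ["tetraquark", "vector", "bar", "rm", "qcd"]

print(keywords_column(topic_5))
print(default_label(5, topic_5))
```

The same helpers also reproduce the outlier row when given topic ID -1 and its keywords.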
config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "calculate_probabilities": true,
+   "language": "english",
+   "low_memory": false,
+   "min_topic_size": 10,
+   "n_gram_range": [
+     1,
+     1
+   ],
+   "nr_topics": null,
+   "seed_topic_list": null,
+   "top_n_words": 10,
+   "verbose": false,
+   "zeroshot_min_similarity": 0.7,
+   "zeroshot_topic_list": null,
+   "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
+ }
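
Review note: `config.json` stores `n_gram_range` as a JSON list, whereas the README reports it as the tuple `(1, 1)`, and it carries an `embedding_model` entry that the README's hyperparameter list does not show. The following is a minimal sketch, using only the standard library, of reading the file back into Python values. The function name `read_bertopic_config` is hypothetical, and separating out the embedding model name is an illustrative choice, not a description of what `BERTopic.load` does internally.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_JSON = """
{
  "calculate_probabilities": true,
  "language": "english",
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": null,
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": false,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
"""


def read_bertopic_config(text):
    """Return (hyperparameters, embedding model name) from a config.json string."""
    params = json.loads(text)
    # JSON has no tuple type, so the n-gram range arrives as a list.
    params["n_gram_range"] = tuple(params["n_gram_range"])
    # Keep the embedding model name apart from the numeric/boolean settings.
    embedding_model = params.pop("embedding_model", None)
    return params, embedding_model


params, embedding_model = read_bertopic_config(CONFIG_JSON)
print(params["n_gram_range"], embedding_model)
```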
ctfidf.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4d9ff73a8200cb61d07e1cde175f7f524864b9ec86379c39cbada15c92c47fb7
+ size 163624
ctfidf_config.json ADDED
The diff for this file is too large to render.
topic_embeddings.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:97b3539dbc450f882fbe24e019319ede33da266408e799d473892717e6679084
+ size 21592
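
Review note: both `.safetensors` entries in this commit are Git LFS pointer files, which record only a spec version, a SHA-256 object ID and a byte size. A minimal sketch of parsing such a pointer and checking a downloaded blob against it is given below. The function names are hypothetical and the sketch assumes the three-key layout shown in this diff.

```python
import hashlib

# Pointer for topic_embeddings.safetensors, copied from this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:97b3539dbc450f882fbe24e019319ede33da266408e799d473892717e6679084
size 21592
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines of a pointer file into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    if pointer["algorithm"] != "sha256":
        return False
    if len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algorithm"], pointer["size"])
```

In practice `matches_pointer` would be called with the bytes of the resolved `topic_embeddings.safetensors` file.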
topics.json ADDED
@@ -0,0 +1,1061 @@
+ {
+   "topic_representations": {
+     "-1": [
+       [
+         "data",
+         0.057213714414237525
+       ],
+       [
+         "steganography",
+         0.052599647661913655
+       ],
+       [
+         "secret",
+         0.0522063435635583
+       ],
+       [
+         "probability",
+         0.03815076384857811
+       ],
+       [
+         "method",
+         0.03175066163911729
+       ],
+       [
+         "image",
+         0.028982776107348653
+       ],
+       [
+         "algorithm",
+         0.02738258177973974
+       ],
+       [
+         "carrier",
+         0.024457482497179096
+       ],
+       [
+         "causality",
+         0.024457482497179096
+       ],
+       [
+         "proposed",
+         0.023430151582015963
+       ]
+     ],
+     "0": [
+       [
+         "sp",
+         0.05009619739292622
+       ],
+       [
+         "intelligence",
+         0.043875450298459254
+       ],
+       [
+         "processing",
+         0.03593384628066711
+       ],
+       [
+         "human",
+         0.027940710139264247
+       ],
+       [
+         "image",
+         0.027135685089157216
+       ],
+       [
+         "theory",
+         0.026271179392751514
+       ],
+       [
+         "learning",
+         0.02588408785233955
+       ],
+       [
+         "compression",
+         0.022926015786885363
+       ],
+       [
+         "model",
+         0.02195360230457192
+       ],
+       [
+         "paper",
+         0.021598613725246688
+       ]
+     ],
+     "1": [
+       [
+         "quantum",
+         0.20497830656681687
+       ],
+       [
+         "automata",
+         0.055099704424567415
+       ],
+       [
+         "finite",
+         0.04864256936050457
+       ],
+       [
+         "classical",
+         0.041208613547798784
+       ],
+       [
+         "measurement",
+         0.040871195749532914
+       ],
+       [
+         "advice",
+         0.03833351170940505
+       ],
+       [
+         "computation",
+         0.037999634637714746
+       ],
+       [
+         "complexity",
+         0.033057528990887625
+       ],
+       [
+         "problem",
+         0.03210202095116961
+       ],
+       [
+         "theory",
+         0.02710011884994224
+       ]
+     ],
+     "2": [
+       [
+         "problems",
+         0.07139934409935379
+       ],
+       [
+         "complexity",
+         0.05840020118919683
+       ],
+       [
+         "constraints",
+         0.05639359474065995
+       ],
+       [
+         "symmetry",
+         0.05541974221386781
+       ],
+       [
+         "csps",
+         0.050133557805763146
+       ],
+       [
+         "manipulation",
+         0.046366212971798736
+       ],
+       [
+         "constraint",
+         0.04197283412530866
+       ],
+       [
+         "fuzzy",
+         0.041485558974767286
+       ],
+       [
+         "counting",
+         0.04145148560771654
+       ],
+       [
+         "boolean",
+         0.0343435202220906
+       ]
+     ],
+     "3": [
+       [
+         "logic",
+         0.1313336252015882
+       ],
+       [
+         "computability",
+         0.11963978703741909
+       ],
+       [
+         "cl",
+         0.07735797464153449
+       ],
+       [
+         "edu",
+         0.05598047393741615
+       ],
+       [
+         "www",
+         0.055403124819459804
+       ],
+       [
+         "http",
+         0.053314737083253474
+       ],
+       [
+         "interactive",
+         0.04962493187500432
+       ],
+       [
+         "computational",
+         0.047859136380397106
+       ],
+       [
+         "completeness",
+         0.045227569490307644
+       ],
+       [
+         "logical",
+         0.04359895221271631
+       ]
+     ],
+     "4": [
+       [
+         "science",
+         0.07984223388390019
+       ],
+       [
+         "citation",
+         0.07457055354674037
+       ],
+       [
+         "journals",
+         0.05404283878943192
+       ],
+       [
+         "social",
+         0.046520388066152954
+       ],
+       [
+         "communication",
+         0.04127075098935727
+       ],
+       [
+         "indicators",
+         0.03578827236512663
+       ],
+       [
+         "journal",
+         0.03462338004030605
+       ],
+       [
+         "analysis",
+         0.034436144295847786
+       ],
+       [
+         "sciences",
+         0.032534793059206035
+       ],
+       [
+         "network",
+         0.030657105260422916
+       ]
+     ],
+     "5": [
+       [
+         "tetraquark",
+         0.17158932097541618
+       ],
+       [
+         "vector",
+         0.13032159882878153
+       ],
+       [
+         "bar",
+         0.12025089339663936
+       ],
+       [
+         "rm",
+         0.10256693848536887
+       ],
+       [
+         "qcd",
+         0.09804944098546507
+       ],
+       [
+         "sum",
+         0.09189046066899752
+       ],
+       [
+         "scalar",
+         0.08920658248755874
+       ],
+       [
+         "rules",
+         0.08331503401745433
+       ],
+       [
+         "axialvector",
+         0.08002397603385222
+       ],
+       [
+         "type",
+         0.07718798696195792
+       ]
+     ],
+     "6": [
+       [
+         "combinatorial",
+         0.07337161990187382
+       ],
+       [
+         "problems",
+         0.07132438037428859
+       ],
+       [
+         "design",
+         0.07045002355762316
+       ],
+       [
+         "problem",
+         0.07007994473052086
+       ],
+       [
+         "clustering",
+         0.06495579084509354
+       ],
+       [
+         "modular",
+         0.06434475167067527
+       ],
+       [
+         "composite",
+         0.04607045313437381
+       ],
+       [
+         "basic",
+         0.045798715500111775
+       ],
+       [
+         "restructuring",
+         0.0455356075025649
+       ],
+       [
+         "considered",
+         0.04250409668637566
+       ]
+     ],
+     "7": [
+       [
+         "prediction",
+         0.07346887009227779
+       ],
+       [
+         "entropy",
+         0.049388020882393135
+       ],
+       [
+         "model",
+         0.04342223351286927
+       ],
+       [
+         "universal",
+         0.041156684068660944
+       ],
+       [
+         "cc",
+         0.0398821362951796
+       ],
+       [
+         "sequence",
+         0.03833297334128056
+       ],
+       [
+         "frequency",
+         0.03696782735322339
+       ],
+       [
+         "moments",
+         0.03687062642745131
+       ],
+       [
+         "prior",
+         0.035545166474997086
+       ],
+       [
+         "delta",
+         0.035164591875029516
+       ]
+     ],
+     "8": [
+       [
+         "notes",
+         0.3043375606150314
+       ],
+       [
+         "informal",
+         0.26686741275958176
+       ],
+       [
+         "spaces",
+         0.20902267911369513
+       ],
+       [
+         "analysis",
+         0.14623695965201486
+       ],
+       [
+         "metric",
+         0.12849072566009637
+       ],
+       [
+         "harmonic",
+         0.11320533811885151
+       ],
+       [
+         "basic",
+         0.1124604422877309
+       ],
+       [
+         "fourier",
+         0.11075459909298217
+       ],
+       [
+         "heisenberg",
+         0.09933027490651725
+       ],
+       [
+         "groups",
+         0.09933027490651725
+       ]
+     ],
+     "9": [
+       [
+         "orbital",
+         0.07637392503225787
+       ],
+       [
+         "earth",
+         0.050613754906920376
+       ],
+       [
+         "postnewtonian",
+         0.043777907886927266
+       ],
+       [
+         "effects",
+         0.04338321849164604
+       ],
+       [
+         "artificial",
+         0.04001276960809818
+       ],
+       [
+         "gravitational",
+         0.039654517132986525
+       ],
+       [
+         "body",
+         0.03771598101182827
+       ],
+       [
+         "relativistic",
+         0.037539658495641476
+       ],
+       [
+         "lageos",
+         0.035385955990421165
+       ],
+       [
+         "particle",
+         0.03390819710407447
+       ]
+     ],
+     "10": [
+       [
+         "keyphrases",
+         0.07085733453003161
+       ],
+       [
+         "word",
+         0.06804636658283127
+       ],
+       [
+         "algorithm",
+         0.05839577931529745
+       ],
+       [
+         "semantic",
+         0.054628755229922506
+       ],
+       [
+         "similarity",
+         0.053418162747602645
+       ],
+       [
+         "analogy",
+         0.05002370392996622
+       ],
+       [
+         "task",
+         0.04050898729482372
+       ],
+       [
+         "learning",
+         0.0404610351950463
+       ],
+       [
+         "relational",
+         0.03978351019980765
+       ],
+       [
+         "text",
+         0.03938901660337154
+       ]
+     ],
+     "11": [
+       [
+         "kernel",
+         0.13866715939113777
+       ],
+       [
+         "gmm",
+         0.09677421786257459
+       ],
+       [
+         "kernels",
+         0.0727196077540381
+       ],
+       [
+         "datasets",
+         0.06771493033489531
+       ],
+       [
+         "classification",
+         0.06188244594279738
+       ],
+       [
+         "logitboost",
+         0.04972358274422686
+       ],
+       [
+         "learning",
+         0.04638112350253201
+       ],
+       [
+         "abclogitboost",
+         0.04396268235421301
+       ],
+       [
+         "algorithms",
+         0.04247139903824243
+       ],
+       [
+         "minmax",
+         0.040994314909468886
+       ]
+     ],
+     "12": [
+       [
+         "data",
+         0.17496854532139364
+       ],
+       [
+         "ultrametric",
+         0.13997092354785093
+       ],
+       [
+         "ultrametricity",
+         0.0734471388942075
+       ],
+       [
+         "analysis",
+         0.07211341481121694
+       ],
+       [
+         "application",
+         0.06323234281193223
+       ],
+       [
+         "high",
+         0.05303568983540695
+       ],
+       [
+         "structure",
+         0.05214915215790164
+       ],
+       [
+         "topology",
+         0.043419153165477786
+       ],
+       [
+         "analytics",
+         0.04247918854100824
+       ],
+       [
+         "dimensional",
+         0.04247918854100824
+       ]
+     ]
+   },
+   "topics": [
+     10,
+     10,
+     10,
+     10,
+     -1,
+     -1,
+     11,
+     10,
+     0,
+     10,
+     10,
+     11,
+     10,
+     10,
+     10,
+     10,
+     10,
+     0,
+     10,
+     10,
+     1,
+     1,
+     0,
+     1,
+     3,
+     1,
+     1,
+     1,
+     0,
+     0,
+     1,
+     1,
+     1,
+     1,
+     1,
+     1,
+     0,
+     1,
+     1,
+     -1,
+     7,
+     7,
+     7,
+     7,
+     7,
+     7,
+     7,
+     7,
+     7,
+     -1,
+     7,
+     7,
+     7,
+     7,
+     7,
+     0,
+     2,
+     2,
+     7,
+     0,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     3,
+     4,
+     4,
+     -1,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     4,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     5,
+     12,
+     12,
+     12,
+     12,
+     4,
+     12,
+     12,
+     12,
+     -1,
+     4,
+     4,
+     12,
+     -1,
+     12,
+     11,
+     12,
+     11,
+     2,
+     12,
+     12,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     9,
+     3,
+     -1,
+     -1,
+     -1,
+     -1,
+     -1,
+     -1,
+     -1,
+     8,
+     -1,
+     1,
+     -1,
+     -1,
+     -1,
+     3,
+     -1,
+     8,
+     3,
+     -1,
+     -1,
+     1,
+     1,
+     1,
+     2,
+     1,
+     1,
+     2,
+     2,
+     2,
+     2,
+     1,
+     1,
+     2,
+     2,
+     1,
+     2,
+     1,
+     -1,
+     2,
+     1,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     0,
+     1,
+     1,
+     -1,
+     1,
+     0,
+     1,
+     0,
+     7,
+     1,
+     1,
+     1,
+     -1,
+     1,
+     1,
+     1,
+     0,
+     1,
+     1,
+     0,
+     0,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     8,
+     -1,
+     8,
+     8,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     6,
+     11,
+     11,
+     7,
+     11,
+     11,
+     11,
+     11,
+     11,
+     5,
+     7,
+     7,
+     11,
+     11,
+     5,
+     -1,
+     11,
+     7,
+     7,
+     11,
+     11,
+     10,
+     0,
+     -1,
+     -1,
+     -1,
+     0,
+     10,
+     0,
+     -1,
+     10,
+     0,
+     -1,
+     12,
+     0,
+     6,
+     0,
+     -1,
+     0,
+     -1,
+     0,
+     4,
+     2,
+     2,
+     2,
+     2,
+     6,
+     2,
+     2,
+     2,
+     -1,
+     2,
+     2,
+     2,
+     -1,
+     0,
+     0,
+     2,
+     0,
+     2,
+     2
+   ],
+   "topic_sizes": {
+     "10": 17,
+     "-1": 35,
+     "11": 16,
+     "0": 64,
+     "1": 37,
+     "3": 24,
+     "7": 21,
+     "2": 25,
+     "4": 23,
+     "5": 22,
+     "12": 13,
+     "9": 20,
+     "8": 21,
+     "6": 22
+   },
+   "topic_mapper": [
+     [
+       -1,
+       -1,
+       -1
+     ],
+     [
+       0,
+       0,
+       9
+     ],
+     [
+       1,
+       1,
+       5
+     ],
+     [
+       2,
+       2,
+       6
+     ],
+     [
+       3,
+       3,
+       8
+     ],
+     [
+       4,
+       4,
+       10
+     ],
+     [
+       5,
+       5,
+       11
+     ],
+     [
+       6,
+       6,
+       4
+     ],
+     [
+       7,
+       7,
+       12
+     ],
+     [
+       8,
+       8,
+       1
+     ],
+     [
+       9,
+       9,
+       7
+     ],
+     [
+       10,
+       10,
+       0
+     ],
+     [
+       11,
+       11,
+       3
+     ],
+     [
+       12,
+       12,
+       2
+     ]
+   ],
+   "topic_labels": {
+     "-1": "-1_data_steganography_secret_probability",
+     "0": "0_sp_intelligence_processing_human",
+     "1": "1_quantum_automata_finite_classical",
+     "2": "2_problems_complexity_constraints_symmetry",
+     "3": "3_logic_computability_cl_edu",
+     "4": "4_science_citation_journals_social",
+     "5": "5_tetraquark_vector_bar_rm",
+     "6": "6_combinatorial_problems_design_problem",
+     "7": "7_prediction_entropy_model_universal",
+     "8": "8_notes_informal_spaces_analysis",
+     "9": "9_orbital_earth_postnewtonian_effects",
+     "10": "10_keyphrases_word_algorithm_semantic",
+     "11": "11_kernel_gmm_kernels_datasets",
+     "12": "12_data_ultrametric_ultrametricity_analysis"
+   },
+   "custom_labels": null,
+   "_outliers": 1,
+   "topic_aspects": {}
+ }
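
Review note: `topic_sizes` can be cross-checked against the per-document `topics` list, since each entry in that list is the topic ID assigned to one training document. The values in `topic_sizes` sum to the 360 training documents stated in the README, but the README table's Topic Frequency column does not line up with them topic by topic (for example, the table lists topic 0 with 35 documents, while `topic_sizes` gives 64 for topic 0 and 35 for the outlier topic -1). A minimal sketch of the check is given below. The helper name `topic_sizes_from_assignments` is hypothetical, and only the first 20 assignments are included to keep the example short.

```python
from collections import Counter


def topic_sizes_from_assignments(topics):
    """Count documents per topic ID, using string keys as topics.json does."""
    return {str(topic_id): count for topic_id, count in Counter(topics).items()}


# "topic_sizes" as stored in topics.json in this commit.
TOPIC_SIZES = {
    "10": 17, "-1": 35, "11": 16, "0": 64, "1": 37, "3": 24, "7": 21,
    "2": 25, "4": 23, "5": 22, "12": 13, "9": 20, "8": 21, "6": 22,
}

# First 20 entries of the "topics" list (one topic ID per training document).
FIRST_20 = [10, 10, 10, 10, -1, -1, 11, 10, 0, 10,
            10, 11, 10, 10, 10, 10, 10, 0, 10, 10]

total_documents = sum(TOPIC_SIZES.values())
outlier_share = TOPIC_SIZES["-1"] / total_documents
sample_sizes = topic_sizes_from_assignments(FIRST_20)

print(total_documents)
print(round(outlier_share, 3))
print(sample_sizes)
```

Running the same helper over the full 360-entry `topics` list should reproduce `TOPIC_SIZES`, which is a quick way to decide which set of frequencies to trust.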