Clemylia committed on
Commit 3217a16 · verified · 1 Parent(s): 0d7d09e

SoraForSLM training complete

Files changed (3)
  1. README.md +54 -0
  2. tokenizer.json +2772 -0
  3. tokenizer_config.json +20 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ library_name: transformers
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: SoraForSLM-1
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # SoraForSLM-1
+
+ This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0004
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
+ - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: linear
+ - num_epochs: 5
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - Transformers 5.0.0
+ - Pytorch 2.10.0+cu128
+ - Datasets 4.0.0
+ - Tokenizers 0.22.2
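One consistency check on the hyperparameters in the card above: `total_train_batch_size` is not set independently — the Trainer reports it as the per-device `train_batch_size` multiplied by `gradient_accumulation_steps`. A quick sketch of that arithmetic with the card's values:

```python
# Values taken from the "Training hyperparameters" section of the card.
train_batch_size = 8
gradient_accumulation_steps = 4

# The effective (total) train batch size is their product,
# matching the card's total_train_batch_size: 32.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32
```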
tokenizer.json ADDED
@@ -0,0 +1,2772 @@
+ {
+ "version": "1.0",
+ "truncation": null,
+ "padding": null,
+ "added_tokens": [
+ {
+ "id": 0,
+ "content": "<pad>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 1,
+ "content": "</s>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 2,
+ "content": "<s>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 3,
+ "content": "<unk>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 4,
+ "content": "<maskSub>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": false
+ },
+ {
+ "id": 5,
+ "content": "Question:",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": false
+ },
+ {
+ "id": 6,
+ "content": "Réponse:",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": false
+ },
+ {
+ "id": 2628,
+ "content": "<mask>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ }
+ ],
+ "normalizer": {
+ "type": "Replace",
+ "pattern": {
+ "String": " "
+ },
+ "content": "▁"
+ },
+ "pre_tokenizer": null,
+ "post_processor": {
+ "type": "TemplateProcessing",
+ "single": [
+ {
+ "Sequence": {
+ "id": "A",
+ "type_id": 0
+ }
+ }
+ ],
+ "pair": [
+ {
+ "Sequence": {
+ "id": "A",
+ "type_id": 0
+ }
+ },
+ {
+ "Sequence": {
+ "id": "B",
+ "type_id": 1
+ }
+ }
+ ],
+ "special_tokens": {}
+ },
+ "decoder": {
+ "type": "Sequence",
+ "decoders": [
+ {
+ "type": "Replace",
+ "pattern": {
+ "String": "▁"
+ },
+ "content": " "
+ },
+ {
+ "type": "ByteFallback"
+ },
+ {
+ "type": "Fuse"
+ }
+ ]
+ },
+ "model": {
+ "type": "BPE",
+ "dropout": null,
+ "unk_token": "<unk>",
+ "continuing_subword_prefix": null,
+ "end_of_word_suffix": null,
+ "fuse_unk": true,
+ "byte_fallback": true,
+ "ignore_merges": false,
+ "vocab": {
+ "<pad>": 0,
+ "</s>": 1,
+ "<s>": 2,
+ "<unk>": 3,
+ "<maskSub>": 4,
+ "Question:": 5,
+ "Réponse:": 6,
+ "▁the": 7,
+ "▁and": 8,
+ "▁to": 9,
+ "▁of": 10,
+ "▁a": 11,
+ "▁your": 12,
+ "▁is": 13,
+ "▁model": 14,
+ "▁data": 15,
+ "▁you": 16,
+ "▁in": 17,
+ "▁training": 18,
+ "▁with": 19,
+ "▁or": 20,
+ "▁for": 21,
+ "▁that": 22,
+ "▁are": 23,
+ "▁can": 24,
+ "▁on": 25,
+ "▁models": 26,
+ "▁by": 27,
+ "▁AI": 28,
+ "▁be": 29,
+ "▁The": 30,
+ "▁an": 31,
+ "▁it": 32,
+ "▁LLM": 33,
+ "▁as": 34,
+ "▁from": 35,
+ "▁use": 36,
+ "▁This": 37,
+ "▁language": 38,
+ "▁into": 39,
+ "▁train": 40,
+ "▁You": 41,
+ "▁may": 42,
+ "▁own": 43,
+ "▁Model": 44,
+ "▁like": 45,
+ "▁LLMs": 46,
+ "▁have": 47,
+ "▁this": 48,
+ "▁text": 49,
+ "▁Data": 50,
+ "▁learning": 51,
+ "▁will": 52,
+ "▁data.": 53,
+ "▁large": 54,
+ "▁through": 55,
+ "▁its": 56,
+ "▁more": 57,
+ "▁need": 58,
+ "▁which": 59,
+ "▁Your": 60,
+ "▁different": 61,
+ "▁even": 62,
+ "▁our": 63,
+ "▁reasoning": 64,
+ "▁using": 65,
+ "▁Language": 66,
+ "▁Large": 67,
+ "▁Training": 68,
+ "▁existing": 69,
+ "▁not": 70,
+ "▁specific": 71,
+ "▁such": 72,
+ "▁trained": 73,
+ "▁-": 74,
+ "▁how": 75,
+ "▁improve": 76,
+ "▁process": 77,
+ "▁In": 78,
+ "▁It": 79,
+ "▁about": 80,
+ "▁generate": 81,
+ "▁including": 82,
+ "▁step": 83,
+ "▁these": 84,
+ "▁used": 85,
+ "▁when": 86,
+ "▁#": 87,
+ "▁To": 88,
+ "▁custom": 89,
+ "▁feedback": 90,
+ "▁simple": 91,
+ "▁up": 92,
+ "▁For": 93,
+ "▁I": 94,
+ "▁We": 95,
+ "▁architecture": 96,
+ "▁company": 97,
+ "▁dataset": 98,
+ "▁learn": 99,
+ "▁models.": 100,
+ "▁performance": 101,
+ "▁require": 102,
+ "▁smaller": 103,
+ "▁tasks": 104,
+ "▁their": 105,
+ "▁they": 106,
+ "▁time": 107,
+ "▁access": 108,
+ "▁content": 109,
+ "▁could": 110,
+ "▁has": 111,
+ "▁if": 112,
+ "▁it’s": 113,
+ "▁make": 114,
+ "▁model,": 115,
+ "▁steps": 116,
+ "▁A": 117,
+ "▁By": 118,
+ "▁Einstein": 119,
+ "▁GPU": 120,
+ "▁Trust": 121,
+ "▁any": 122,
+ "▁data,": 123,
+ "▁do": 124,
+ "▁first": 125,
+ "▁model.": 126,
+ "▁needs": 127,
+ "▁start": 128,
+ "▁transformer": 129,
+ "▁user": 130,
+ "▁we": 131,
+ "▁you’ll": 132,
+ "▁1.": 133,
+ "▁5": 134,
+ "▁Collection": 135,
+ "▁Google": 136,
+ "▁How": 137,
+ "▁Layer": 138,
+ "▁Salesforce": 139,
+ "▁Step": 140,
+ "▁Use": 141,
+ "▁When": 142,
+ "▁all": 143,
+ "▁allows": 144,
+ "▁applications.": 145,
+ "▁create": 146,
+ "▁datasets": 147,
+ "▁don’t": 148,
+ "▁human": 149,
+ "▁most": 150,
+ "▁my": 151,
+ "▁number": 152,
+ "▁one": 153,
+ "▁output": 154,
+ "▁performance.": 155,
+ "▁power": 156,
+ "▁pre-trained": 157,
+ "▁prompt": 158,
+ "▁save": 159,
+ "▁some": 160,
+ "▁take": 161,
+ "▁Ensure": 162,
+ "▁Hugging": 163,
+ "▁These": 164,
+ "▁at": 165,
+ "▁batch": 166,
+ "▁been": 167,
+ "▁before": 168,
+ "▁better": 169,
+ "▁but": 170,
+ "▁complex": 171,
+ "▁crucial": 172,
+ "▁each": 173,
+ "▁example,": 174,
+ "▁fine-tuning": 175,
+ "▁guide": 176,
+ "▁input": 177,
+ "▁involves": 178,
+ "▁next": 179,
+ "▁other": 180,
+ "▁outputs": 181,
+ "▁parallelism": 182,
+ "▁processing": 183,
+ "▁provide": 184,
+ "▁relevant": 185,
+ "▁several": 186,
+ "▁significant": 187,
+ "▁size": 188,
+ "▁step.": 189,
+ "▁techniques": 190,
+ "▁them": 191,
+ "▁tokens": 192,
+ "▁training.": 193,
+ "▁understand": 194,
+ "▁unique": 195,
+ "▁want": 196,
+ "▁without": 197,
+ "▁words": 198,
+ "▁(e.g.": 199,
+ "▁2": 200,
+ "▁=": 201,
+ "▁API": 202,
+ "▁If": 203,
+ "▁Models": 204,
+ "▁Prompt": 205,
+ "▁They": 206,
+ "▁Train": 207,
+ "▁What": 208,
+ "▁able": 209,
+ "▁across": 210,
+ "▁apples": 211,
+ "▁articles": 212,
+ "▁between": 213,
+ "▁build": 214,
+ "▁computational": 215,
+ "▁customer": 216,
+ "▁deep": 217,
+ "▁develop": 218,
+ "▁effective": 219,
+ "▁enables": 220,
+ "▁evaluation": 221,
+ "▁get": 222,
+ "▁information": 223,
+ "▁key": 224,
+ "▁lets": 225,
+ "▁me": 226,
+ "▁might": 227,
+ "▁new": 228,
+ "▁numbers": 229,
+ "▁out": 230,
+ "▁powerful": 231,
+ "▁problems": 232,
+ "▁process,": 233,
+ "▁providers": 234,
+ "▁question": 235,
+ "▁requirements": 236,
+ "▁resources": 237,
+ "▁so": 238,
+ "▁specialized": 239,
+ "▁text.": 240,
+ "▁then": 241,
+ "▁tokenization": 242,
+ "▁tools": 243,
+ "▁training,": 244,
+ "▁transformers": 245,
+ "▁understanding": 246,
+ "▁various": 247,
+ "▁was": 248,
+ "▁web": 249,
+ "▁where": 250,
+ "▁you’re": 251,
+ "▁2.": 252,
+ "▁4.": 253,
+ "▁But": 254,
+ "▁Claude": 255,
+ "▁Claude.": 256,
+ "▁Conclusion": 257,
+ "▁Environment": 258,
+ "▁Face": 259,
+ "▁However,": 260,
+ "▁I’m": 261,
+ "▁LLM.": 262,
+ "▁LLM?": 263,
+ "▁Python": 264,
+ "▁There": 265,
+ "▁ability": 266,
+ "▁add": 267,
+ "▁al.": 268,
+ "▁amount": 269,
+ "▁analyze": 270,
+ "▁apples.": 271,
+ "▁better.": 272,
+ "▁called": 273,
+ "▁chats": 274,
+ "▁cloud": 275,
+ "▁compared": 276,
+ "▁compromising": 277,
+ "▁content,": 278,
+ "▁conversation": 279,
+ "▁conversations": 280,
+ "▁designed": 281,
+ "▁directly": 282,
+ "▁efficient": 283,
+ "▁email": 284,
+ "▁enabling": 285,
+ "▁engage": 286,
+ "▁entire": 287,
+ "▁et": 288,
+ "▁example": 289,
+ "▁fine": 290,
+ "▁further": 291,
+ "▁generic": 292,
+ "▁gives": 293,
+ "▁guide,": 294,
+ "▁handle": 295,
+ "▁human-like": 296,
+ "▁journey": 297,
+ "▁libraries": 298,
+ "▁looking": 299,
+ "▁machine": 300,
+ "▁manual": 301,
+ "▁many": 302,
+ "▁massive": 303,
+ "▁models,": 304,
+ "▁needs.": 305,
+ "▁often": 306,
+ "▁open-source": 307,
+ "▁parts": 308,
+ "▁platform": 309,
+ "▁preprocessing": 310,
+ "▁prompts": 311,
+ "▁public": 312,
+ "▁questions": 313,
+ "▁research,": 314,
+ "▁resources,": 315,
+ "▁same": 316,
+ "▁see": 317,
+ "▁sensitive": 318,
+ "▁sequences": 319,
+ "▁set": 320,
+ "▁special": 321,
+ "▁still": 322,
+ "▁support": 323,
+ "▁systems": 324,
+ "▁task": 325,
+ "▁task.": 326,
+ "▁tasks,": 327,
+ "▁think": 328,
+ "▁though": 329,
+ "▁users": 330,
+ "▁weights": 331,
+ "▁well": 332,
+ "▁we’ll": 333,
+ "▁whether": 334,
+ "▁work": 335,
+ "▁would": 336,
+ "▁(LLMs)": 337,
+ "▁(e.g.,": 338,
+ "▁(like": 339,
+ "▁10": 340,
+ "▁15,": 341,
+ "▁APIs": 342,
+ "▁Acme": 343,
+ "▁After": 344,
+ "▁Builder": 345,
+ "▁Building": 346,
+ "▁CRM": 347,
+ "▁ChatGPT": 348,
+ "▁Cloud,": 349,
+ "▁Common": 350,
+ "▁Datasets,": 351,
+ "▁Deploying": 352,
+ "▁Encoding": 353,
+ "▁Evaluating": 354,
+ "▁Figure": 355,
+ "▁FrontierMath": 356,
+ "▁GPT-4": 357,
+ "▁GPT-4,": 358,
+ "▁GPUs,": 359,
+ "▁Gemini": 360,
+ "▁Google,": 361,
+ "▁Here": 362,
+ "▁Here’s": 363,
+ "▁Install": 364,
+ "▁Instead": 365,
+ "▁Kaggle": 366,
+ "▁Keep": 367,
+ "▁LLM,": 368,
+ "▁Let's": 369,
+ "▁ML": 370,
+ "▁NLP": 371,
+ "▁Number": 372,
+ "▁Once": 373,
+ "▁OpenAI,": 374,
+ "▁Performance": 375,
+ "▁Services": 376,
+ "▁Since": 377,
+ "▁TensorFlow": 378,
+ "▁Think": 379,
+ "▁Tiers": 380,
+ "▁Tokenization": 381,
+ "▁Validation": 382,
+ "▁Web": 383,
+ "▁Whether": 384,
+ "▁Winter": 385,
+ "▁With": 386,
+ "▁You’ll": 387,
+ "▁accuracy,": 388,
+ "▁accurate": 389,
+ "▁adding": 390,
+ "▁also": 391,
+ "▁answer": 392,
+ "▁answering": 393,
+ "▁applicable": 394,
+ "▁applications": 395,
+ "▁approach": 396,
+ "▁architecture,": 397,
+ "▁around": 398,
+ "▁articles,": 399,
+ "▁artificial": 400,
+ "▁automate": 401,
+ "▁available.": 402,
+ "▁basic": 403,
+ "▁biases": 404,
+ "▁both": 405,
+ "▁bought": 406,
+ "▁capture": 407,
+ "▁cases": 408,
+ "▁chains": 409,
+ "▁choose": 410,
+ "▁code": 411,
+ "▁coding": 412,
+ "▁combine": 413,
+ "▁comes": 414,
+ "▁common": 415,
+ "▁component": 416,
+ "▁components": 417,
+ "▁computing": 418,
+ "▁concerns": 419,
+ "▁consists": 420,
+ "▁continuous": 421,
+ "▁control": 422,
+ "▁costs": 423,
+ "▁customize": 424,
+ "▁dataset.": 425,
+ "▁depending": 426,
+ "▁determine": 427,
+ "▁divided": 428,
+ "▁documents,": 429,
+ "▁does": 430,
+ "▁domains.": 431,
+ "▁down": 432,
+ "▁effectiveness": 433,
+ "▁enhance": 434,
+ "▁ensure": 435,
+ "▁ensuring": 436,
+ "▁essential": 437,
+ "▁examples": 438,
+ "▁faster": 439,
+ "▁find": 440,
+ "▁following": 441,
+ "▁framework.": 442,
+ "▁full": 443,
+ "▁function": 444,
+ "▁generated": 445,
+ "▁generative": 446,
+ "▁goal": 447,
+ "▁goes": 448,
+ "▁grounded": 449,
+ "▁grounding": 450,
+ "▁group": 451,
+ "▁help": 452,
+ "▁helps": 453,
+ "▁here": 454,
+ "▁hyperparameters": 455,
+ "▁import": 456,
+ "▁improving": 457,
+ "▁include": 458,
+ "▁includes": 459,
+ "▁infrastructure": 460,
+ "▁install": 461,
+ "▁intelligence": 462,
+ "▁language.": 463,
+ "▁latest": 464,
+ "▁lead": 465,
+ "▁level": 466,
+ "▁local": 467,
+ "▁long": 468,
+ "▁made": 469,
+ "▁means": 470,
+ "▁metrics": 471,
+ "▁model’s": 472,
+ "▁more.": 473,
+ "▁multimodal": 474,
+ "▁natural": 475,
+ "▁neural": 476,
+ "▁odd": 477,
+ "▁only": 478,
+ "▁order": 479,
+ "▁over": 480,
+ "▁part": 481,
+ "▁performance,": 482,
+ "▁performs": 483,
+ "▁popular": 484,
+ "▁possible": 485,
+ "▁potential": 486,
+ "▁prediction": 487,
+ "▁privacy": 488,
+ "▁private": 489,
+ "▁prompt.": 490,
+ "▁prompting": 491,
+ "▁range": 492,
+ "▁ready": 493,
+ "▁reinforcement": 494,
+ "▁relationships": 495,
+ "▁remain": 496,
+ "▁required": 497,
+ "▁requirements.": 498,
+ "▁research": 499,
+ "▁results": 500,
+ "▁safety": 501,
+ "▁security": 502,
+ "▁self-attention": 503,
+ "▁sequence": 504,
+ "▁services": 505,
+ "▁single": 506,
+ "▁size,": 507,
+ "▁solve": 508,
+ "▁sources": 509,
+ "▁sources,": 510,
+ "▁step-by-step": 511,
+ "▁tailored": 512,
+ "▁tasks.": 513,
+ "▁team": 514,
+ "▁technique": 515,
+ "▁text,": 516,
+ "▁than": 517,
+ "▁too": 518,
+ "▁tool": 519,
+ "▁two": 520,
+ "▁unlock": 521,
+ "▁useful": 522,
+ "▁validation": 523,
+ "▁variety": 524,
+ "▁what": 525,
+ "▁word": 526,
+ "▁\"Let's": 527,
+ "▁(2022)": 528,
+ "▁(Hint:": 529,
+ "▁(LLM)": 530,
+ "▁(MoE)": 531,
+ "▁(NLP)": 532,
+ "▁)": 533,
+ "▁1:": 534,
+ "▁1–4": 535,
+ "▁2):": 536,
+ "▁2022:": 537,
+ "▁2023:": 538,
+ "▁2:": 539,
+ "▁3.": 540,
+ "▁5.": 541,
+ "▁6": 542,
+ "▁6.": 543,
+ "▁A:": 544,
+ "▁AI,": 545,
+ "▁AI.": 546,
+ "▁APIs.": 547,
+ "▁AWS,": 548,
+ "▁Adding": 549,
+ "▁Additionally,": 550,
+ "▁Although": 551,
+ "▁An": 552,
+ "▁Anthropic.": 553,
+ "▁Architecture": 554,
+ "▁Are": 555,
+ "▁At": 556,
+ "▁BERT": 557,
+ "▁Because": 558,
+ "▁Best": 559,
+ "▁CPU,": 560,
+ "▁Choose": 561,
+ "▁Claude,": 562,
+ "▁Consider": 563,
+ "▁Continuous": 564,
+ "▁Customize": 565,
+ "▁Dataset": 566,
+ "▁Deekshitha": 567,
+ "▁Drive),": 568,
+ "▁During": 569,
+ "▁Evaluation": 570,
+ "▁Feedback": 571,
+ "▁Fine-Tuning": 572,
+ "▁First,": 573,
+ "▁GPT,": 574,
+ "▁GPT-3": 575,
+ "▁GPU,": 576,
+ "▁ID": 577,
+ "▁Imagine": 578,
+ "▁Improvement": 579,
+ "▁Incognito": 580,
+ "▁Introduction": 581,
+ "▁It’s": 582,
+ "▁I’ve": 583,
+ "▁LLM-powered": 584,
+ "▁LLM’s": 585,
+ "▁Learn": 586,
+ "▁Learning": 587,
+ "▁MCP": 588,
+ "▁Meta,": 589,
+ "▁NVIDIA).": 590,
+ "▁Now": 591,
+ "▁On": 592,
+ "▁One": 593,
+ "▁OpenAI’s": 594,
+ "▁Optimization": 595,
+ "▁Practices": 596,
+ "▁Preparation": 597,
+ "▁Prerequisites": 598,
+ "▁Privacy": 599,
+ "▁Prompt:": 600,
+ "▁Remember,": 601,
+ "▁Remove": 602,
+ "▁Researcher": 603,
+ "▁Rubin": 604,
+ "▁Salesforce’s": 605,
+ "▁Services,": 606,
+ "▁Set": 607,
+ "▁Setup": 608,
+ "▁Some": 609,
+ "▁Sree": 610,
+ "▁Stage": 611,
+ "▁Summer": 612,
+ "▁Take": 613,
+ "▁Testing": 614,
+ "▁Text": 615,
+ "▁Time:": 616,
+ "▁Trainer,": 617,
+ "▁Transformer": 618,
+ "▁Weights": 619,
+ "▁While": 620,
+ "▁Wikipedia": 621,
+ "▁Write": 622,
+ "▁Yerra": 623,
+ "▁above.": 624,
+ "▁abstracts": 625,
+ "▁academic": 626,
+ "▁accessible": 627,
+ "▁according": 628,
+ "▁accuracy": 629,
+ "▁actual": 630,
+ "▁address)": 631,
+ "▁adjusting": 632,
+ "▁advantage": 633,
+ "▁after": 634,
+ "▁against": 635,
+ "▁agent,": 636,
+ "▁allow": 637,
+ "▁allowing": 638,
+ "▁along": 639,
+ "▁always": 640,
+ "▁amounts": 641,
+ "▁analysis.": 642,
+ "▁another": 643,
+ "▁answering,": 644,
+ "▁applying": 645,
+ "▁approach.": 646,
+ "▁approaches": 647,
+ "▁appropriate": 648,
+ "▁apps.": 649,
+ "▁architecture.": 650,
+ "▁architectures": 651,
+ "▁assess": 652,
+ "▁ate": 653,
+ "▁attention": 654,
+ "▁audience": 655,
+ "▁audit": 656,
+ "▁based": 657,
+ "▁because": 658,
+ "▁behavior,": 659,
+ "▁being": 660,
+ "▁broad": 661,
+ "▁calls,": 662,
+ "▁capabilities,": 663,
+ "▁capable": 664,
+ "▁capitalization,": 665,
+ "▁case": 666,
+ "▁case.": 667,
+ "▁chain-of-thought": 668,
+ "▁challenging": 669,
+ "▁characters.": 670,
+ "▁chips,": 671,
+ "▁choose.": 672,
+ "▁chunks": 673,
+ "▁clear": 674,
+ "▁clusters": 675,
+ "▁collect": 676,
+ "▁collected": 677,
+ "▁collection": 678,
+ "▁combining": 679,
+ "▁come": 680,
+ "▁commonly": 681,
+ "▁concepts": 682,
+ "▁conduct": 683,
+ "▁configuration": 684,
+ "▁connectors": 685,
+ "▁consider": 686,
+ "▁contact": 687,
+ "▁context,”": 688,
+ "▁continue": 689,
+ "▁controls": 690,
+ "▁conversation,": 691,
+ "▁conversational": 692,
+ "▁conversations,": 693,
+ "▁copied": 694,
+ "▁core": 695,
+ "▁cover": 696,
+ "▁creating": 697,
+ "▁creativity": 698,
+ "▁data):": 699,
+ "▁data:": 700,
+ "▁data?": 701,
+ "▁datasets,": 702,
+ "▁days,": 703,
+ "▁de-link": 704,
+ "▁decay": 705,
+ "▁demonstrations": 706,
+ "▁demonstrations.": 707,
+ "▁dependencies": 708,
+ "▁deployment.": 709,
+ "▁described": 710,
+ "▁detection": 711,
+ "▁detection,": 712,
+ "▁developers": 713,
+ "▁developing": 714,
+ "▁development": 715,
+ "▁difference": 716,
+ "▁difficult": 717,
+ "▁directory": 718,
+ "▁distributed": 719,
+ "▁diverse": 720,
+ "▁diversity": 721,
+ "▁documents": 722,
+ "▁domain": 723,
+ "▁domain-specific": 724,
+ "▁dynamic": 725,
+ "▁easiest": 726,
+ "▁easy": 727,
+ "▁effectively.": 728,
+ "▁effects": 729,
+ "▁emergent": 730,
+ "▁encoding": 731,
+ "▁engineers.": 732,
+ "▁enough": 733,
+ "▁enterprises": 734,
+ "▁epochs": 735,
+ "▁evaluating": 736,
+ "▁excel": 737,
+ "▁expensive": 738,
+ "▁experience,": 739,
+ "▁experimentation": 740,
+ "▁explicitly": 741,
+ "▁fake": 742,
+ "▁few": 743,
+ "▁fine-tune": 744,
+ "▁first.": 745,
+ "▁flow": 746,
+ "▁found": 747,
+ "▁foundation": 748,
+ "▁frameworks": 749,
+ "▁fundamental": 750,
+ "▁gateway": 751,
+ "▁gathering": 752,
+ "▁gave": 753,
+ "▁generation": 754,
+ "▁generation,": 755,
+ "▁generation.": 756,
+ "▁generators,": 757,
+ "▁give": 758,
+ "▁goals": 759,
+ "▁going": 760,
+ "▁good": 761,
+ "▁ground": 762,
+ "▁groundbreaking": 763,
+ "▁had": 764,
+ "▁half": 765,
+ "▁handling": 766,
+ "▁hardware": 767,
+ "▁having": 768,
+ "▁helped": 769,
+ "▁heuristics": 770,
+ "▁high": 771,
+ "▁high-quality": 772,
+ "▁idea": 773,
+ "▁images,": 774,
+ "▁implementing": 775,
+ "▁important": 776,
+ "▁improved": 777,
+ "▁improvement": 778,
+ "▁included": 779,
+ "▁incredibly": 780,
+ "▁industry.": 781,
+ "▁inference": 782,
+ "▁inference.": 783,
+ "▁information.": 784,
+ "▁infrastructure,": 785,
+ "▁initial": 786,
+ "▁innovation.": 787,
+ "▁instructions": 788,
+ "▁interacts": 789,
+ "▁interesting": 790,
+ "▁intermediate": 791,
+ "▁internal": 792,
+ "▁introduction": 793,
+ "▁investment": 794,
+ "▁journey.": 795,
+ "▁just": 796,
+ "▁keep": 797,
+ "▁known": 798,
+ "▁laws.": 799,
+ "▁layer": 800,
+ "▁layers": 801,
+ "▁leads": 802,
+ "▁learns": 803,
+ "▁legal": 804,
+ "▁llm": 805,
+ "▁load": 806,
+ "▁look": 807,
+ "▁loss": 808,
+ "▁lot": 809,
+ "▁main": 810,
+ "▁making": 811,
+ "▁manner": 812,
+ "▁me,": 813,
+ "▁measure": 814,
+ "▁mechanism,": 815,
+ "▁memory": 816,
+ "▁metrics.": 817,
+ "▁millions": 818,
+ "▁mistakes,": 819,
+ "▁model:": 820,
+ "▁monitor": 821,
+ "▁must": 822,
+ "▁necessary": 823,
+ "▁needed": 824,
+ "▁neighbor": 825,
+ "▁network,": 826,
+ "▁now": 827,
+ "▁number:": 828,
+ "▁objective": 829,
+ "▁objective.": 830,
+ "▁once": 831,
+ "▁open": 832,
+ "▁optimal": 833,
+ "▁optimized": 834,
+ "▁option": 835,
+ "▁outcomes.": 836,
+ "▁output,": 837,
+ "▁padding": 838,
+ "▁parameters": 839,
+ "▁parameters,": 840,
+ "▁parameters.": 841,
+ "▁particularly": 842,
+ "▁passing": 843,
+ "▁path": 844,
+ "▁perform": 845,
+ "▁permitted": 846,
+ "▁personalized": 847,
+ "▁pip": 848,
+ "▁point": 849,
+ "▁practical": 850,
+ "▁pre-training": 851,
+ "▁predict": 852,
+ "▁preferences,": 853,
+ "▁prepared": 854,
+ "▁pretrained": 855,
+ "▁problems,": 856,
+ "▁produce": 857,
+ "▁products": 858,
+ "▁project": 859,
+ "▁prompt,": 860,
+ "▁proprietary": 861,
+ "▁provided": 862,
+ "▁provider,": 863,
+ "▁provides": 864,
+ "▁punctuation,": 865,
+ "▁pure": 866,
+ "▁quality": 867,
+ "▁questions,": 868,
+ "▁rate": 869,
+ "▁rate,": 870,
+ "▁raw": 871,
+ "▁real": 872,
+ "▁real-world": 873,
+ "▁related": 874,
+ "▁remote": 875,
+ "▁removing": 876,
+ "▁repeated": 877,
+ "▁requires": 878,
+ "▁residual": 879,
+ "▁result": 880,
+ "▁results,": 881,
+ "▁review": 882,
+ "▁right": 883,
+ "▁satisfies": 884,
+ "▁scope": 885,
+ "▁scratch": 886,
+ "▁second": 887,
+ "▁sent": 888,
+ "▁sentiment": 889,
+ "▁sequence.": 890,
+ "▁servers,": 891,
+ "▁sessions": 892,
+ "▁set.": 893,
+ "▁share": 894,
+ "▁should": 895,
+ "▁show": 896,
+ "▁showing": 897,
+ "▁significantly": 898,
+ "▁solutions": 899,
+ "▁solutions.": 900,
+ "▁something": 901,
+ "▁sophisticated": 902,
+ "▁speaks": 903,
+ "▁splits": 904,
+ "▁started": 905,
+ "▁starting": 906,
+ "▁step\"": 907,
+ "▁steps,": 908,
+ "▁storing": 909,
+ "▁strategies": 910,
+ "▁study": 911,
+ "▁styles": 912,
+ "▁suggests": 913,
+ "▁support.": 914,
+ "▁switch,": 915,
+ "▁system": 916,
+ "▁takes": 917,
+ "▁taking": 918,
+ "▁teams": 919,
+ "▁terms": 920,
+ "▁test": 921,
+ "▁things": 922,
+ "▁this.": 923,
+ "▁thorough": 924,
+ "▁those": 925,
+ "▁thumbs": 926,
+ "▁time.": 927,
+ "▁times": 928,
+ "▁to)": 929,
+ "▁to.": 930,
+ "▁today": 931,
+ "▁token": 932,
+ "▁tokens,": 933,
+ "▁tokens.": 934,
+ "▁toxicity": 935,
+ "▁trail": 936,
+ "▁train-your-own-model": 937,
+ "▁transformed": 938,
+ "▁translation": 939,
+ "▁trial-and-error,": 940,
+ "▁try": 941,
+ "▁tune": 942,
+ "▁tuning": 943,
+ "▁type": 944,
+ "▁types": 945,
+ "▁under": 946,
+ "▁understand,": 947,
+ "▁updated": 948,
+ "▁us": 949,
+ "▁use,": 950,
+ "▁useful.": 951,
+ "▁usually": 952,
+ "▁versatile": 953,
+ "▁via": 954,
+ "▁websites,": 955,
+ "▁weight": 956,
+ "▁went": 957,
+ "▁who": 958,
+ "▁words,": 959,
+ "▁world": 960,
+ "▁write": 961,
+ "▁writing": 962,
+ "▁yet": 963,
+ "▁“in": 964,
+ "▁#1": 965,
+ "▁#2": 966,
+ "▁#can": 967,
+ "▁$324,573": 968,
+ "▁$357,542": 969,
+ "▁$375,286": 970,
+ "▁$388,852": 971,
+ "▁$402,255": 972,
+ "▁(2022),": 973,
+ "▁(9,": 974,
+ "▁(AI)": 975,
+ "▁(AI),": 976,
+ "▁(AWS).": 977,
+ "▁(Auto-CoT)": 978,
+ "▁(BPE),": 979,
+ "▁(CoT)": 980,
+ "▁(GPU": 981,
+ "▁(Kojima": 982,
+ "▁(LLM)?": 983,
+ "▁(LLMs).": 984,
+ "▁(ML)": 985,
+ "▁(More": 986,
+ "▁(PII)": 987,
+ "▁(RLHF)": 988,
+ "▁(SGD)": 989,
+ "▁(Sales,": 990,
+ "▁(Sweeps)": 991,
+ "▁(decrease": 992,
+ "▁(from": 993,
+ "▁(graphics": 994,
+ "▁(grounded": 995,
+ "▁(grouping": 996,
+ "▁(in": 997,
+ "▁(including": 998,
+ "▁(making": 999,
+ "▁(not": 1000,
+ "▁(see": 1001,
+ "▁(splitting": 1002,
+ "▁(train": 1003,
+ "▁(using": 1004,
+ "▁(without": 1005,
+ "▁(writing": 1006,
+ "▁/": 1007,
+ "▁1": 1008,
+ "▁1)": 1009,
+ "▁1):": 1010,
+ "▁1-3": 1011,
+ "▁11": 1012,
+ "▁12,": 1013,
+ "▁13": 1014,
+ "▁13,": 1015,
+ "▁17B": 1016,
+ "▁2,": 1017,
+ "▁2021.": 1018,
+ "▁2022)": 1019,
+ "▁2024:": 1020,
+ "▁2026.6-layer.": 1021,
+ "▁25.": 1022,
+ "▁32,": 1023,
+ "▁397B": 1024,
+ "▁3:": 1025,
+ "▁4": 1026,
+ "▁4,": 1027,
+ "▁4:": 1028,
+ "▁5,": 1029,
+ "▁5:": 1030,
+ "▁6-layer": 1031,
+ "▁60": 1032,
+ "▁6:": 1033,
+ "▁7,": 1034,
+ "▁7.": 1035,
+ "▁7:": 1036,
+ "▁8,": 1037,
+ "▁8.": 1038,
+ "▁82,": 1039,
+ "▁9,": 1040,
+ "▁9.": 1041,
+ "▁:": 1042,
+ "▁ACME.": 1043,
+ "▁AI:": 1044,
+ "▁API,": 1045,
+ "▁APIs),": 1046,
+ "▁APIs?": 1047,
+ "▁APIs”": 1048,
+ "▁AWS": 1049,
+ "▁AWS.2": 1050,
+ "▁Account": 1051,
+ "▁Adjusting": 1052,
+ "▁Adventure.": 1053,
+ "▁Alibaba’s": 1054,
+ "▁Always": 1055,
+ "▁Amazon": 1056,
+ "▁Among": 1057,
+ "▁Analyze": 1058,
+ "▁And": 1059,
+ "▁Another": 1060,
+ "▁Anthropic": 1061,
+ "▁Anthropic,": 1062,
+ "▁Anthropic’s": 1063,
+ "▁Any": 1064,
1206
+ "▁Apex": 1065,
1207
+ "▁Applications": 1066,
1208
+ "▁Architecture:": 1067,
1209
+ "▁Artificial": 1068,
1210
+ "▁As": 1069,
1211
+ "▁Assemble": 1070,
1212
+ "▁Audit": 1071,
1213
+ "▁Auto-CoT": 1072,
1214
+ "▁Auto-CoT,": 1073,
1215
+ "▁AutoTokenizer": 1074,
1216
+ "▁AutoTokenizer.from_pretrained('gpt-4')": 1075,
1217
+ "▁Automatic": 1076,
1218
+ "▁Azure": 1077,
1219
+ "▁BERT,": 1078,
1220
+ "▁BLEU": 1079,
1221
+ "▁BLEU,": 1080,
1222
+ "▁Balancing": 1081,
1223
+ "▁Based": 1082,
1224
+ "▁Basic": 1083,
1225
+ "▁Batch": 1084,
1226
+ "▁Batching": 1085,
1227
+ "▁Be": 1086,
1228
+ "▁Before": 1087,
1229
+ "▁Better": 1088,
1230
+ "▁Biases": 1089,
1231
+ "▁Biases,": 1090,
1232
+ "▁Bigger": 1091,
1233
+ "▁Blackwell": 1092,
1234
+ "▁BlueField-4": 1093,
1235
+ "▁Bypass": 1094,
1236
+ "▁Byte": 1095,
1237
+ "▁CCPA,": 1096,
1238
+ "▁CEO": 1097,
1239
+ "▁CEO.": 1098,
1240
+ "▁Campaign": 1099,
1241
+ "▁Chain-of-Thought": 1100,
1242
+ "▁Chat": 1101,
1243
+ "▁Chrome.": 1102,
1244
+ "▁Cleaning:": 1103,
1245
+ "▁CoT": 1104,
1246
+ "▁Code.": 1105,
1247
+ "▁Cohere,": 1106,
1248
+ "▁Colab": 1107,
1249
+ "▁Collect": 1108,
1250
+ "▁Commerce),": 1109,
1251
+ "▁Community": 1110,
1252
+ "▁Computational": 1111,
1253
+ "▁Considerations:": 1112,
1254
+ "▁Consistently": 1113,
1255
+ "▁Contents": 1114,
1256
+ "▁Convert": 1115,
1257
+ "▁Cost": 1116,
1258
+ "▁Crawl": 1117,
1259
+ "▁Crawl,": 1118,
1260
+ "▁Creating": 1119,
1261
+ "▁Custom": 1120,
1262
+ "▁DPU,": 1121,
1263
+ "▁Data.gov": 1122,
1264
+ "▁Dead:": 1123,
1265
+ "▁Decide": 1124,
1266
+ "▁Decrease": 1125,
1267
+ "▁DeepSeek-R1": 1126,
1268
+ "▁Define": 1127,
1269
+ "▁Demasking": 1128,
1270
+ "▁Demasking,": 1129,
1271
+ "▁Depending": 1130,
1272
+ "▁Deploy": 1131,
1273
+ "▁Deployment": 1132,
1274
+ "▁Detection": 1133,
1275
+ "▁Detectors": 1134,
1276
+ "▁Determine": 1135,
1277
+ "▁Determining": 1136,
1278
+ "▁DevOps": 1137,
1279
+ "▁DevOps,": 1138,
1280
+ "▁Developer": 1139,
1281
+ "▁Developers": 1140,
1282
+ "▁Different": 1141,
1283
+ "▁Difficulty": 1142,
1284
+ "▁Domain-specific": 1143,
1285
+ "▁Don’t": 1144,
1286
+ "▁Download": 1145,
1287
+ "▁Drawing": 1146,
1288
+ "▁Each": 1147,
1289
+ "▁Edge,": 1148,
1290
+ "▁Edition": 1149,
1291
+ "▁Eliminate": 1150,
1292
+ "▁Elite,": 1151,
1293
+ "▁Engage": 1152,
1294
+ "▁English": 1153,
1295
+ "▁Ethernet": 1154,
1296
+ "▁Ethical": 1155,
1297
+ "▁Evaluate": 1156,
1298
+ "▁Everyone": 1157,
1299
+ "▁Experimenting": 1158,
1300
+ "▁Expertise:": 1159,
1301
+ "▁F1-score": 1160,
1302
+ "▁FAQs": 1161,
1303
+ "▁FP8": 1162,
1304
+ "▁Face)": 1163,
1305
+ "▁Face.": 1164,
1306
+ "▁False.": 1165,
1307
+ "▁Familiarity": 1166,
1308
+ "▁FastAPI": 1167,
1309
+ "▁Feed": 1168,
1310
+ "▁Feed-forward": 1169,
1311
+ "▁Feel": 1170,
1312
+ "▁Final": 1171,
1313
+ "▁Finally,": 1172,
1314
+ "▁Fine": 1173,
1315
+ "▁Fine-tune": 1174,
1316
+ "▁Flask": 1175,
1317
+ "▁Follow": 1176,
1318
+ "▁Framework": 1177,
1319
+ "▁Free,": 1178,
1320
+ "▁Freeze": 1179,
1321
+ "▁From": 1180,
1322
+ "▁GDPR": 1181,
1323
+ "▁GPT": 1182,
1324
+ "▁GPT-3,": 1183,
1325
+ "▁GPT-3.": 1184,
1326
+ "▁GPT2LMHeadModel": 1185,
1327
+ "▁GPT2LMHeadModel.from_pretrained('gpt-4')": 1186,
1328
+ "▁GPU.": 1187,
1329
+ "▁GPUs": 1188,
1330
+ "▁GUI": 1189,
1331
+ "▁Game": 1190,
1332
+ "▁Getting": 1191,
1333
+ "▁GitHub": 1192,
1334
+ "▁Given": 1193,
1335
+ "▁Google.": 1194,
1336
+ "▁Google’s": 1195,
1337
+ "▁Graph": 1196,
1338
+ "▁Happy": 1197,
1339
+ "▁Hard": 1198,
1340
+ "▁Have": 1199,
1341
+ "▁Hyperparameter": 1200,
1342
+ "▁Hyperparameters": 1201,
1343
+ "▁I'm": 1202,
1344
+ "▁Image": 1203,
1345
+ "▁Implementing": 1204,
1346
+ "▁Improve": 1205,
1347
+ "▁Infrastructure": 1206,
1348
+ "▁Input": 1207,
1349
+ "▁Instead,": 1208,
1350
+ "▁Integrating": 1209,
1351
+ "▁Intelligence": 1210,
1352
+ "▁Introduced": 1211,
1353
+ "▁Involvement:": 1212,
1354
+ "▁IoT,": 1213,
1355
+ "▁Is": 1214,
1356
+ "▁It's": 1215,
1357
+ "▁Kojima": 1216,
1358
+ "▁LLM!": 1217,
1359
+ "▁LLM:": 1218,
1360
+ "▁LLMs,": 1219,
1361
+ "▁LaMDA": 1220,
1362
+ "▁Larger": 1221,
1363
+ "▁Layer’s": 1222,
1364
+ "▁Leading": 1223,
1365
+ "▁Learning:": 1224,
1366
+ "▁Libraries:": 1225,
1367
+ "▁LinkedIn": 1226,
1368
+ "▁Lisa": 1227,
1369
+ "▁Llama": 1228,
1370
+ "▁Llama,": 1229,
1371
+ "▁Long": 1230,
1372
+ "▁Loop:": 1231,
1373
+ "▁Loss": 1232,
1374
+ "▁Lower": 1233,
1375
+ "▁Luckily,": 1234,
1376
+ "▁Machine": 1235,
1377
+ "▁Major": 1236,
1378
+ "▁Make": 1237,
1379
+ "▁Making": 1238,
1380
+ "▁Marketing": 1239,
1381
+ "▁Marketing,": 1240,
1382
+ "▁Martinez,": 1241,
1383
+ "▁Max": 1242,
1384
+ "▁Medical": 1243,
1385
+ "▁Mentor,": 1244,
1386
+ "▁Meta": 1245,
1387
+ "▁Metrics": 1246,
1388
+ "▁Metrics:": 1247,
1389
+ "▁Microsoft,": 1248,
1390
+ "▁Mixture-of-Experts": 1249,
1391
+ "▁MoE": 1250,
1392
+ "▁Model.": 1251,
1393
+ "▁Model:": 1252,
1394
+ "▁Model?": 1253,
1395
+ "▁Most": 1254,
1396
+ "▁Multi-head": 1255,
1397
+ "▁NLP.": 1256,
1398
+ "▁NVIDIA": 1257,
1399
+ "▁NVLink": 1258,
1400
+ "▁Natural": 1259,
1401
+ "▁New": 1260,
1402
+ "▁Nobody": 1261,
1403
+ "▁Normalization": 1262,
1404
+ "▁Northern": 1263,
1405
+ "▁Objective": 1264,
1406
+ "▁Open-source": 1265,
1407
+ "▁OpenAI": 1266,
1408
+ "▁OpenML,": 1267,
1409
+ "▁Our": 1268,
1410
+ "▁Outfitters.": 1269,
1411
+ "▁Output:": 1270,
1412
+ "▁Own": 1271,
1413
+ "▁PaLM": 1272,
1414
+ "▁Pair": 1273,
1415
+ "▁Peak,": 1274,
1416
+ "▁Perplexity:": 1275,
1417
+ "▁Pipeline": 1276,
1418
+ "▁Platform": 1277,
1419
+ "▁Plenty": 1278,
1420
+ "▁Policy,": 1279,
1421
+ "▁Popular": 1280,
1422
+ "▁Pre-trained": 1281,
1423
+ "▁Pre-training": 1282,
1424
+ "▁Prepare": 1283,
1425
+ "▁Preparing": 1284,
1426
+ "▁Preprocess": 1285,
1427
+ "▁Preprocessing": 1286,
1428
+ "▁Preprocessing:": 1287,
1429
+ "▁Pretrained": 1288,
1430
+ "▁Pro,": 1289,
1431
+ "▁Processing": 1290,
1432
+ "▁Program)": 1291,
1433
+ "▁Public": 1292,
1434
+ "▁PyTorch": 1293,
1435
+ "▁PyTorch,": 1294,
1436
+ "▁PyTorch.": 1295,
1437
+ "▁Python:": 1296,
1438
+ "▁Question": 1297,
1439
+ "▁Qwen3.5": 1298,
1440
+ "▁Qwen3.5,": 1299,
1441
+ "▁Qwen3.5-397B-A17B,": 1300,
1442
+ "▁RAM,": 1301,
1443
+ "▁RL": 1302,
1444
+ "▁RLHF,": 1303,
1445
+ "▁ROUGE:": 1304,
1446
+ "▁Recently,": 1305,
1447
+ "▁Recognize": 1306,
1448
+ "▁Record": 1307,
1449
+ "▁RefinedWeb": 1308,
1450
+ "▁Reflecting": 1309,
1451
+ "▁Regularization": 1310,
1452
+ "▁Regularly": 1311,
1453
+ "▁Reinforcement": 1312,
1454
+ "▁Representative": 1313,
1455
+ "▁Required": 1314,
1456
+ "▁Required:": 1315,
1457
+ "▁Resources:": 1316,
1458
+ "▁Rubin-based": 1317,
1459
+ "▁Rubin-era": 1318,
1460
+ "▁SVP": 1319,
1461
+ "▁Safeguard": 1320,
1462
+ "▁Safeguards": 1321,
1463
+ "▁Salesforce.": 1322,
1464
+ "▁Savings": 1323,
1465
+ "▁Schedule": 1324,
1466
+ "▁Search,": 1325,
1467
+ "▁Secure": 1326,
1468
+ "▁Security": 1327,
1469
+ "▁Select": 1328,
1470
+ "▁Sentiment": 1329,
1471
+ "▁Service,": 1330,
1472
+ "▁Serving": 1331,
1473
+ "▁Setting": 1332,
1474
+ "▁Settings.": 1333,
1475
+ "▁Setup:": 1334,
1476
+ "▁Should": 1335,
1477
+ "▁Simplify": 1336,
1478
+ "▁Skills": 1337,
1479
+ "▁Smaller": 1338,
1480
+ "▁Smith,": 1339,
1481
+ "▁So,": 1340,
1482
+ "▁Source:": 1341,
1483
+ "▁Sources:": 1342,
1484
+ "▁Sourcing": 1343,
1485
+ "▁Speaker,": 1344,
1486
+ "▁Spectrum-6": 1345,
1487
+ "▁Split": 1346,
1488
+ "▁Starting": 1347,
1489
+ "▁State": 1348,
1490
+ "▁Stay": 1349,
1491
+ "▁Steps": 1350,
1492
+ "▁Stochastic": 1351,
1493
+ "▁Strategic": 1352,
1494
+ "▁SuperNIC,": 1353,
1495
+ "▁Support:": 1354,
1496
+ "▁T5": 1355,
1497
+ "▁T5.": 1356,
1498
+ "▁Table": 1357,
1499
+ "▁Tailored": 1358,
1500
+ "▁Task-specific": 1359,
1501
+ "▁Technical": 1360,
1502
+ "▁Techniques:": 1361,
1503
+ "▁Tensor": 1362,
1504
+ "▁Tenth": 1363,
1505
+ "▁Tester": 1364,
1506
+ "▁That’s": 1365,
1507
+ "▁Then": 1366,
1508
+ "▁Then,": 1367,
1509
+ "▁Thoughts": 1368,
1510
+ "▁Tier": 1369,
1511
+ "▁Tokenization.": 1370,
1512
+ "▁Tokenization:": 1371,
1513
+ "▁Tokenize": 1372,
1514
+ "▁Tokens": 1373,
1515
+ "▁Toxicity": 1374,
1516
+ "▁Track": 1375,
1517
+ "▁Trail": 1376,
1518
+ "▁Trainer(": 1377,
1519
+ "▁TrainingArguments": 1378,
1520
+ "▁TrainingArguments(": 1379,
1521
+ "▁Transformers": 1380,
1522
+ "▁Transformers,": 1381,
1523
+ "▁Translation": 1382,
1524
+ "▁Trusted": 1383,
1525
+ "▁Truths": 1384,
1526
+ "▁Tuning": 1385,
1527
+ "▁Tuning:": 1386,
1528
+ "▁Understanding": 1387,
1529
+ "▁Undetectable": 1388,
1530
+ "▁Up": 1389,
1531
+ "▁Usage": 1390,
1532
+ "▁Useful": 1391,
1533
+ "▁Using": 1392,
1534
+ "▁Validation:": 1393,
1535
+ "▁Varying": 1394,
1536
+ "▁Vera": 1395,
1537
+ "▁Verify": 1396,
1538
+ "▁Vision": 1397,
1539
+ "▁Weave": 1398,
1540
+ "▁Wei": 1399,
1541
+ "▁Why": 1400,
1542
+ "▁Why?": 1401,
1543
+ "▁Word": 1402,
1544
+ "▁WordPiece": 1403,
1545
+ "▁Work": 1404,
1546
+ "▁Wow!": 1405,
1547
+ "▁abilities": 1406,
1548
+ "▁abilities.": 1407,
1549
+ "▁above": 1408,
1550
+ "▁access.": 1409,
1551
+ "▁account": 1410,
1552
+ "▁accounts": 1411,
1553
+ "▁achieving": 1412,
1554
+ "▁acquired": 1413,
1555
+ "▁act": 1414,
1556
+ "▁activated": 1415,
1557
+ "▁adapt": 1416,
1558
+ "▁adaptation.": 1417,
1559
+ "▁added": 1418,
1560
+ "▁added,": 1419,
1561
+ "▁additional": 1420,
1562
+ "▁adjacent": 1421,
1563
+ "▁adjusts": 1422,
1564
+ "▁adopt": 1423,
1565
+ "▁advancements": 1424,
1566
+ "▁advantages": 1425,
1567
+ "▁affect": 1426,
1568
+ "▁agreements": 1427,
1569
+ "▁aiming": 1428,
1570
+ "▁algorithm": 1429,
1571
+ "▁algorithms": 1430,
1572
+ "▁aligning": 1431,
1573
+ "▁although": 1432,
1574
+ "▁amounts:": 1433,
1575
+ "▁analysis": 1434,
1576
+ "▁anymore.": 1435,
1577
+ "▁app,": 1436,
1578
+ "▁appears": 1437,
1579
+ "▁apple,": 1438,
1580
+ "▁apples,": 1439,
1581
+ "▁application": 1440,
1582
+ "▁applications,": 1441,
1583
+ "▁applications?": 1442,
1584
+ "▁approach:": 1443,
1585
+ "▁approach?": 1444,
1586
+ "▁approximately": 1445,
1587
+ "▁architected": 1446,
1588
+ "▁architectures,": 1447,
1589
+ "▁are:": 1448,
1590
+ "▁areas": 1449,
1591
+ "▁args=training_args,": 1450,
1592
+ "▁arguments,": 1451,
1593
+ "▁arise": 1452,
1594
+ "▁arises": 1453,
1595
+ "▁article": 1454,
1596
+ "▁article,": 1455,
1597
+ "▁as:": 1456,
1598
+ "▁ask": 1457,
1599
+ "▁asking": 1458,
1600
+ "▁assigns": 1459,
1601
+ "▁assistance": 1460,
1602
+ "▁assistant": 1461,
1603
+ "▁associated": 1462,
1604
+ "▁attempt": 1463,
1605
+ "▁attend": 1464,
1606
+ "▁attention.": 1465,
1607
+ "▁audio,": 1466,
1608
+ "▁auditing": 1467,
1609
+ "▁authentication": 1468,
1610
+ "▁authorized": 1469,
1611
+ "▁authors": 1470,
1612
+ "▁automatic": 1471,
1613
+ "▁automating": 1472,
1614
+ "▁availability": 1473,
1615
+ "▁availability,": 1474,
1616
+ "▁available": 1475,
1617
+ "▁away": 1476,
1618
+ "▁back-end": 1477,
1619
+ "▁batching": 1478,
1620
+ "▁be.": 1479,
1621
+ "▁become": 1480,
1622
+ "▁becomes": 1481,
1623
+ "▁before.": 1482,
1624
+ "▁belongs": 1483,
1625
+ "▁below.": 1484,
1626
+ "▁below:": 1485,
1627
+ "▁benchmark": 1486,
1628
+ "▁beneficial:": 1487,
1629
+ "▁benefits": 1488,
1630
+ "▁best": 1489,
1631
+ "▁beyond": 1490,
1632
+ "▁beyond.": 1491,
1633
+ "▁bias": 1492,
1634
+ "▁biggest": 1493,
1635
+ "▁billions": 1494,
1636
+ "▁bind": 1495,
1637
+ "▁block": 1496,
1638
+ "▁blocks": 1497,
1639
+ "▁blog": 1498,
1640
+ "▁books,": 1499,
1641
+ "▁brand": 1500,
1642
+ "▁break": 1501,
1643
+ "▁breaking": 1502,
1644
+ "▁builders": 1503,
1645
+ "▁building": 1504,
1646
+ "▁built": 1505,
1647
+ "▁business": 1506,
1648
+ "▁button": 1507,
1649
+ "▁button,": 1508,
1650
+ "▁buys": 1509,
1651
+ "▁calls": 1510,
1652
+ "▁calls.": 1511,
1653
+ "▁came": 1512,
1654
+ "▁can,": 1513,
1655
+ "▁capabilities": 1514,
1656
+ "▁capabilities.": 1515,
1657
+ "▁captures": 1516,
1658
+ "▁carefully": 1517,
1659
+ "▁cases,": 1518,
1660
+ "▁cases.": 1519,
1661
+ "▁cater": 1520,
1662
+ "▁certain": 1521,
1663
+ "▁chain": 1522,
1664
+ "▁chains.": 1523,
1665
+ "▁challenge": 1524,
1666
+ "▁changed": 1525,
1667
+ "▁changing": 1526,
1668
+ "▁characteristics": 1527,
1669
+ "▁characters,": 1528,
1670
+ "▁chart": 1529,
1671
+ "▁chatbot": 1530,
1672
+ "▁chatbot?": 1531,
1673
+ "▁chatbots": 1532,
1674
+ "▁chatbots,": 1533,
1675
+ "▁chats.": 1534,
1676
+ "▁checks": 1535,
1677
+ "▁chip": 1536,
1678
+ "▁chips.": 1537,
1679
+ "▁choosing": 1538,
1680
+ "▁chunks).": 1539,
1681
+ "▁chunks.": 1540,
1682
+ "▁claim": 1541,
1683
+ "▁classification": 1542,
1684
+ "▁claude@gmail.com": 1543,
1685
+ "▁clean": 1544,
1686
+ "▁cleaned": 1545,
1687
+ "▁cleaner.": 1546,
1688
+ "▁click.": 1547,
1689
+ "▁clone": 1548,
1690
+ "▁cloud,": 1549,
1691
+ "▁cluster": 1550,
1692
+ "▁clustering:": 1551,
1693
+ "▁code,": 1552,
1694
+ "▁code.": 1553,
1695
+ "▁codesign": 1554,
1696
+ "▁collaborate": 1555,
1697
+ "▁combination": 1556,
1698
+ "▁comments": 1557,
1699
+ "▁commercial": 1558,
1700
+ "▁committed": 1559,
1701
+ "▁committing": 1560,
1702
+ "▁common.": 1561,
1703
+ "▁communication": 1562,
1704
+ "▁communities": 1563,
1705
+ "▁company,": 1564,
1706
+ "▁company.": 1565,
1707
+ "▁company?": 1566,
1708
+ "▁compare": 1567,
1709
+ "▁comparing": 1568,
1710
+ "▁compass": 1569,
1711
+ "▁compelling": 1570,
1712
+ "▁complete": 1571,
1713
+ "▁completely": 1572,
1714
+ "▁complexity": 1573,
1715
+ "▁compliance.": 1574,
1716
+ "▁compliance:": 1575,
1717
+ "▁complicated": 1576,
1718
+ "▁complies": 1577,
1719
+ "▁comply": 1578,
1720
+ "▁components.": 1579,
1721
+ "▁comprehend": 1580,
1722
+ "▁comprehension.": 1581,
1723
+ "▁comprehensive": 1582,
1724
+ "▁computationally": 1583,
1725
+ "▁compute,": 1584,
1726
+ "▁computer,": 1585,
1727
+ "▁conducting": 1586,
1728
+ "▁configure": 1587,
1729
+ "▁configured": 1588,
1730
+ "▁configuring": 1589,
1731
+ "▁connect": 1590,
1732
+ "▁connection": 1591,
1733
+ "▁connections": 1592,
1734
+ "▁consistency,": 1593,
1735
+ "▁consistency.": 1594,
1736
+ "▁consistent": 1595,
1737
+ "▁consistently": 1596,
1738
+ "▁construct": 1597,
1739
+ "▁consumer": 1598,
1740
+ "▁consumption,": 1599,
1741
+ "▁containerization": 1600,
1742
+ "▁contains": 1601,
1743
+ "▁content!": 1602,
1744
+ "▁context": 1603,
1745
+ "▁contexts": 1604,
1746
+ "▁contextual": 1605,
1747
+ "▁continues": 1606,
1748
+ "▁continuing": 1607,
1749
+ "▁continuously": 1608,
1750
+ "▁convergence": 1609,
1751
+ "▁converges": 1610,
1752
+ "▁convert": 1611,
1753
+ "▁converting": 1612,
1754
+ "▁copy": 1613,
1755
+ "▁copyright": 1614,
1756
+ "▁cornerstone": 1615,
1757
+ "▁costs.": 1616,
1758
+ "▁countless": 1617,
1759
+ "▁covered": 1618,
1760
+ "▁crafted": 1619,
1761
+ "▁created": 1620,
1762
+ "▁creates": 1621,
1763
+ "▁creation": 1622,
1764
+ "▁creator": 1623,
1765
+ "▁critical": 1624,
1766
+ "▁criticism": 1625,
1767
+ "▁crowded": 1626,
1768
+ "▁crucial.": 1627,
1769
+ "▁curated": 1628,
1770
+ "▁curves": 1629,
1771
+ "▁customization,": 1630,
1772
+ "▁cut": 1631,
1773
+ "▁cutting": 1632,
1774
+ "▁data)": 1633,
1775
+ "▁database": 1634,
1776
+ "▁dataset,": 1635,
1777
+ "▁datasets.": 1636,
1778
+ "▁datasets.1": 1637,
1779
+ "▁datasets:": 1638,
1780
+ "▁daunting": 1639,
1781
+ "▁deal": 1640,
1782
+ "▁decide": 1641,
1783
+ "▁decision": 1642,
1784
+ "▁decision.": 1643,
1785
+ "▁decisions": 1644,
1786
+ "▁decrease": 1645,
1787
+ "▁deeper": 1646,
1788
+ "▁default": 1647,
1789
+ "▁defined": 1648,
1790
+ "▁defining": 1649,
1791
+ "▁deliver": 1650,
1792
+ "▁demands": 1651,
1793
+ "▁demasking,": 1652,
1794
+ "▁demonstration": 1653,
1795
+ "▁demonstrations,": 1654,
1796
+ "▁dense": 1655,
1797
+ "▁dependencies.": 1656,
1798
+ "▁deployed,": 1657,
1799
+ "▁deployed.": 1658,
1800
+ "▁deployment": 1659,
1801
+ "▁derived": 1660,
1802
+ "▁descent": 1661,
1803
+ "▁describe": 1662,
1804
+ "▁describes": 1663,
1805
+ "▁design": 1664,
1806
+ "▁desired": 1665,
1807
+ "▁detect": 1666,
1808
+ "▁detected": 1667,
1809
+ "▁determines": 1668,
1810
+ "▁development,": 1669,
1811
+ "▁development.": 1670,
1812
+ "▁diagnosis": 1671,
1813
+ "▁did": 1672,
1814
+ "▁differences": 1673,
1815
+ "▁difficulties": 1674,
1816
+ "▁direct": 1675,
1817
+ "▁disadvantages.": 1676,
1818
+ "▁discussed": 1677,
1819
+ "▁discussions.": 1678,
1820
+ "▁diverges": 1679,
1821
+ "▁diverse,": 1680,
1822
+ "▁diverse.": 1681,
1823
+ "▁dividing": 1682,
1824
+ "▁do,": 1683,
1825
+ "▁documentation,": 1684,
1826
+ "▁does.": 1685,
1827
+ "▁doesn’t": 1686,
1828
+ "▁doing": 1687,
1829
+ "▁dollars.": 1688,
1830
+ "▁domain-specific,": 1689,
1831
+ "▁domain.": 1690,
1832
+ "▁don't": 1691,
1833
+ "▁downside": 1692,
1834
+ "▁draft": 1693,
1835
+ "▁drawbacks,": 1694,
1836
+ "▁dropout": 1695,
1837
+ "▁duplicates": 1696,
1838
+ "▁duplicates,": 1697,
1839
+ "▁during": 1698,
1840
+ "▁dynamically": 1699,
1841
+ "▁earlier": 1700,
1842
+ "▁earlier.": 1701,
1843
+ "▁early": 1702,
1844
+ "▁easily": 1703,
1845
+ "▁economical": 1704,
1846
+ "▁economics": 1705,
1847
+ "▁efficiency,": 1706,
1848
+ "▁efficiency:": 1707,
1849
+ "▁efficient.": 1708,
1850
+ "▁efficiently.": 1709,
1851
+ "▁effort": 1710,
1852
+ "▁efforts": 1711,
1853
+ "▁elements": 1712,
1854
+ "▁elevate": 1713,
1855
+ "▁eliminate": 1714,
1856
+ "▁else": 1715,
1857
+ "▁embedding": 1716,
1858
+ "▁embedding,": 1717,
1859
+ "▁embeddings": 1718,
1860
+ "▁embeddings,": 1719,
1861
+ "▁emerged": 1720,
1862
+ "▁emergence": 1721,
1863
+ "▁emphasize": 1722,
1864
+ "▁enable": 1723,
1865
+ "▁enabled": 1724,
1866
+ "▁encoding.": 1725,
1867
+ "▁encourages": 1726,
1868
+ "▁end": 1727,
1869
+ "▁end,": 1728,
1870
+ "▁endeavor.": 1729,
1871
+ "▁enforce": 1730,
1872
+ "▁engineers": 1731,
1873
+ "▁engineers,": 1732,
1874
+ "▁enough:": 1733,
1875
+ "▁enriched": 1734,
1876
+ "▁ensured": 1735,
1877
+ "▁ensures": 1736,
1878
+ "▁entail": 1737,
1879
+ "▁entails": 1738,
1880
+ "▁enter": 1739,
1881
+ "▁enterprise": 1740,
1882
+ "▁enterprises,": 1741,
1883
+ "▁entirely,": 1742,
1884
+ "▁environment": 1743,
1885
+ "▁environment,": 1744,
1886
+ "▁environment.": 1745,
1887
+ "▁epochs.": 1746,
1888
+ "▁equipped": 1747,
1889
+ "▁errors.": 1748,
1890
+ "▁essence": 1749,
1891
+ "▁essentially": 1750,
1892
+ "▁establish": 1751,
1893
+ "▁etc.": 1752,
1894
+ "▁etc.)": 1753,
1895
+ "▁etc...": 1754,
1896
+ "▁ethical": 1755,
1897
+ "▁ethically": 1756,
1898
+ "▁eval_dataset=eval_dataset": 1757,
1899
+ "▁evaluate": 1758,
1900
+ "▁evaluated": 1759,
1901
+ "▁evaluations": 1760,
1902
+ "▁ever": 1761,
1903
+ "▁every": 1762,
1904
+ "▁everyone": 1763,
1905
+ "▁example.": 1764,
1906
+ "▁examples,": 1765,
1907
+ "▁examples.": 1766,
1908
+ "▁excited": 1767,
1909
+ "▁execution,": 1768,
1910
+ "▁exhibit": 1769,
1911
+ "▁expected": 1770,
1912
+ "▁expensive,": 1771,
1913
+ "▁experience": 1772,
1914
+ "▁expertise": 1773,
1915
+ "▁expertise,": 1774,
1916
+ "▁experts": 1775,
1917
+ "▁exploring,": 1776,
1918
+ "▁exposing": 1777,
1919
+ "▁extend": 1778,
1920
+ "▁extensive": 1779,
1921
+ "▁extensive,": 1780,
1922
+ "▁extent.": 1781,
1923
+ "▁extraction.": 1782,
1924
+ "▁eye": 1783,
1925
+ "▁fact,": 1784,
1926
+ "▁fails,": 1785,
1927
+ "▁fails.": 1786,
1928
+ "▁fair": 1787,
1929
+ "▁faithfulness,": 1788,
1930
+ "▁fall": 1789,
1931
+ "▁familiar": 1790,
1932
+ "▁familiarity": 1791,
1933
+ "▁far": 1792,
1934
+ "▁fascinates": 1793,
1935
+ "▁fascinating": 1794,
1936
+ "▁fast.": 1795,
1937
+ "▁fastest": 1796,
1938
+ "▁fed": 1797,
1939
+ "▁feed": 1798,
1940
+ "▁feed-forward": 1799,
1941
+ "▁feedback.": 1800,
1942
+ "▁feeding": 1801,
1943
+ "▁feel": 1802,
1944
+ "▁few-shot": 1803,
1945
+ "▁fewer": 1804,
1946
+ "▁field": 1805,
1947
+ "▁field.": 1806,
1948
+ "▁fields": 1807,
1949
+ "▁finance,": 1808,
1950
+ "▁financial": 1809,
1951
+ "▁fine-tuned": 1810,
1952
+ "▁fine-tuning,": 1811,
1953
+ "▁fine-tuning.": 1812,
1954
+ "▁finished": 1813,
1955
+ "▁fit.": 1814,
1956
+ "▁fix": 1815,
1957
+ "▁fixing": 1816,
1958
+ "▁flagged": 1817,
1959
+ "▁flows,": 1818,
1960
+ "▁fluency,": 1819,
1961
+ "▁focus": 1820,
1962
+ "▁follow": 1821,
1963
+ "▁follow-up": 1822,
1964
+ "▁follow.": 1823,
1965
+ "▁followed": 1824,
1966
+ "▁form": 1825,
1967
+ "▁format": 1826,
1968
+ "▁formats": 1827,
1969
+ "▁forms,": 1828,
1970
+ "▁forums": 1829,
1971
+ "▁foundational": 1830,
1972
+ "▁four": 1831,
1973
+ "▁frameworks,": 1832,
1974
+ "▁free": 1833,
1975
+ "▁freedom": 1834,
1976
+ "▁frequent": 1835,
1977
+ "▁friendly": 1836,
1978
+ "▁from.": 1837,
1979
+ "▁full-scale": 1838,
1980
+ "▁function.": 1839,
1981
+ "▁functions": 1840,
1982
+ "▁fundamentals": 1841,
1983
+ "▁future.": 1842,
1984
+ "▁game-changer": 1843,
1985
+ "▁game-changing": 1844,
1986
+ "▁game.": 1845,
1987
+ "▁gateway:": 1846,
1988
+ "▁gather": 1847,
1989
+ "▁gauge": 1848,
1990
+ "▁gemini@gmail.com": 1849,
1991
+ "▁general": 1850,
1992
+ "▁general-purpose": 1851,
1993
+ "▁generalized": 1852,
1994
+ "▁generates": 1853,
1995
+ "▁generating": 1854,
1996
+ "▁generator": 1855,
1997
+ "▁generator,": 1856,
1998
+ "▁generators": 1857,
1999
+ "▁generic,": 1858,
2000
+ "▁git": 1859,
2001
+ "▁given": 1860,
2002
+ "▁go": 1861,
2003
+ "▁goal,": 1862,
2004
+ "▁goal.": 1863,
2005
+ "▁golden": 1864,
2006
+ "▁good?": 1865,
2007
+ "▁governed": 1866,
2008
+ "▁gpt-4@gmail.com": 1867,
2009
+ "▁gradient": 1868,
2010
+ "▁granting": 1869,
2011
+ "▁graphical": 1870,
2012
+ "▁great": 1871,
2013
+ "▁great,": 1872,
2014
+ "▁greatest": 1873,
2015
+ "▁grow": 1874,
2016
+ "▁guides": 1875,
2017
+ "▁hand,": 1876,
2018
+ "▁hand-crafting": 1877,
2019
+ "▁happy": 1878,
2020
+ "▁hardware.": 1879,
2021
+ "▁hardware–software": 1880,
2022
+ "▁harmful": 1881,
2023
+ "▁harmonize": 1882,
2024
+ "▁haven’t": 1883,
2025
+ "▁heads": 1884,
2026
+ "▁healthcare,": 1885,
2027
+ "▁heart,": 1886,
2028
+ "▁heavily": 1887,
2029
+ "▁here\",": 1888,
2030
+ "▁here.": 1889,
2031
+ "▁heterogeneous": 1890,
2032
+ "▁high-volume,": 1891,
2033
+ "▁highly": 1892,
2034
+ "▁hot": 1893,
2035
+ "▁hours": 1894,
2036
+ "▁hours,": 1895,
2037
+ "▁how.": 1896,
2038
+ "▁https://github.com/SreeEswaran/Train-your-LLM": 1897,
2039
+ "▁human-labeled": 1898,
2040
+ "▁human.": 1899,
2041
+ "▁humans": 1900,
2042
+ "▁hundred": 1901,
2043
+ "▁hybrid": 1902,
2044
+ "▁hyperparameters,": 1903,
2045
+ "▁hyperparameters.": 1904,
2046
+ "▁i.e.,": 1905,
2047
+ "▁identifiable": 1906,
2048
+ "▁identify": 1907,
2049
+ "▁identity": 1908,
2050
+ "▁if:": 1909,
2051
+ "▁illustrate": 1910,
2052
+ "▁illustrates": 1911,
2053
+ "▁immediately": 1912,
2054
+ "▁immense": 1913,
2055
+ "▁implement": 1914,
2056
+ "▁implemented": 1915,
2057
+ "▁implications": 1916,
2058
+ "▁important.": 1917,
2059
+ "▁imported": 1918,
2060
+ "▁impressive": 1919,
2061
+ "▁improvement.": 1920,
2062
+ "▁improvements,": 1921,
2063
+ "▁in-context": 1922,
2064
+ "▁in.": 1923,
2065
+ "▁include:": 1924,
2066
+ "▁incognito": 1925,
2067
+ "▁incorporate": 1926,
2068
+ "▁incorrect!": 1927,
2069
+ "▁increase": 1928,
2070
+ "▁increased": 1929,
2071
+ "▁increasing": 1930,
2072
+ "▁increasingly": 1931,
2073
+ "▁indicates": 1932,
2074
+ "▁individual": 1933,
2075
+ "▁industries": 1934,
2076
+ "▁industries.": 1935,
2077
+ "▁industry": 1936,
2078
+ "▁industry,": 1937,
2079
+ "▁industry:": 1938,
2080
+ "▁influenced": 1939,
2081
+ "▁information,": 1940,
2082
+ "▁informed": 1941,
2083
+ "▁infrastructure.": 1942,
2084
+ "▁innovation": 1943,
2085
+ "▁innovation,": 1944,
2086
+ "▁innovative": 1945,
2087
+ "▁input.": 1946,
2088
+ "▁inputs": 1947,
2089
+ "▁inside": 1948,
2090
+ "▁insightful,": 1949,
2091
+ "▁insights": 1950,
2092
+ "▁insights!": 1951,
2093
+ "▁installed.": 1952,
2094
+ "▁instance,": 1953,
2095
+ "▁instances.": 1954,
2096
+ "▁instantiated": 1955,
2097
+ "▁integrate": 1956,
2098
+ "▁integrating": 1957,
2099
+ "▁intensive": 1958,
2100
+ "▁interaction,": 1959,
2101
+ "▁intimidating": 1960,
2102
+ "▁intricacy": 1961,
2103
+ "▁intricate": 1962,
2104
+ "▁introduced": 1963,
2105
+ "▁introduces": 1964,
2106
+ "▁invest": 1965,
2107
+ "▁involve": 1966,
2108
+ "▁involved": 1967,
2109
+ "▁involving": 1968,
2110
+ "▁irrelevant": 1969,
2111
+ "▁is.": 1970,
2112
+ "▁isn’t": 1971,
2113
+ "▁issue:": 1972,
2114
+ "▁it.": 1973,
2115
+ "▁iteratively,": 1974,
2116
+ "▁job.": 1975,
2117
+ "▁joining": 1976,
2118
+ "▁journey,": 1977,
2119
+ "▁kept": 1978,
2120
+ "▁key.": 1979,
2121
+ "▁kind": 1980,
2122
+ "▁kinds": 1981,
2123
+ "▁knowledge": 1982,
2124
+ "▁knowledge.": 1983,
2125
+ "▁knowledgeable": 1984,
2126
+ "▁labor": 1985,
2127
+ "▁labs,": 1986,
2128
+ "▁language,": 1987,
2129
+ "▁language-related": 1988,
2130
+ "▁larger": 1989,
2131
+ "▁last": 1990,
2132
+ "▁late": 1991,
2133
+ "▁later": 1992,
2134
+ "▁later.)": 1993,
2135
+ "▁laws": 1994,
2136
+ "▁layer,": 1995,
2137
+ "▁leading": 1996,
2138
+ "▁leaked": 1997,
2139
+ "▁learning.": 1998,
2140
+ "▁least": 1999,
2141
+ "▁left.": 2000,
2142
+ "▁lemmatization": 2001,
2143
+ "▁length": 2002,
2144
+ "▁length),": 2003,
2145
+ "▁length.": 2004,
2146
+ "▁lengthy": 2005,
2147
+ "▁less": 2006,
2148
+ "▁let": 2007,
2149
+ "▁level,": 2008,
2150
+ "▁level.": 2009,
2151
+ "▁leverage": 2010,
2152
+ "▁leverages": 2011,
2153
+ "▁leveraging": 2012,
2154
+ "▁libraries.": 2013,
2155
+ "▁library": 2014,
2156
+ "▁licensing": 2015,
2157
+ "▁lies": 2016,
2158
+ "▁light": 2017,
2159
+ "▁like:": 2018,
2160
+ "▁liked": 2019,
2161
+ "▁likely": 2020,
2162
+ "▁limitations": 2021,
2163
+ "▁limited": 2022,
2164
+ "▁line": 2023,
2165
+ "▁line,": 2024,
2166
+ "▁linear": 2025,
2167
+ "▁lines:": 2026,
2168
+ "▁lingo": 2027,
2169
+ "▁list": 2028,
2170
+ "▁literature,": 2029,
2171
+ "▁llama@gmail.com": 2030,
2172
+ "▁logging": 2031,
2173
+ "▁logging.": 2032,
2174
+ "▁logging_dir='./logs',": 2033,
2175
+ "▁logs": 2034,
2176
+ "▁logs,": 2035,
2177
+ "▁long-context": 2036,
2178
+ "▁loop": 2037,
2179
+ "▁lower-cost": 2038,
2180
+ "▁lowercase": 2039,
2181
+ "▁lowercase,": 2040,
2182
+ "▁magnitude": 2041,
2183
+ "▁maintaining": 2042,
2184
+ "▁maintenance": 2043,
2185
+ "▁makes": 2044,
2186
+ "▁manage": 2045,
2187
+ "▁manageable": 2046,
2188
+ "▁manner.": 2047,
2189
+ "▁manually": 2048,
2190
+ "▁map": 2049,
2191
+ "▁market": 2050,
2192
+ "▁market.": 2051,
2193
+ "▁masking": 2052,
2194
+ "▁masking,": 2053,
2195
+ "▁mass": 2054,
2196
+ "▁master": 2055,
2197
+ "▁match": 2056,
2198
+ "▁math": 2057,
2199
+ "▁mathematical": 2058,
2200
+ "▁mathematicians.": 2059,
2201
+ "▁mathematics": 2060,
2202
+ "▁mathematics.": 2061,
2203
+ "▁matter": 2062,
2204
+ "▁matter.": 2063,
2205
+ "▁maximize": 2064,
2206
+ "▁media.": 2065,
2207
+ "▁mentioned": 2066,
2208
+ "▁met": 2067,
2209
+ "▁meticulous": 2068,
2210
+ "▁metric": 2069,
2211
+ "▁millions.": 2070,
2212
+ "▁min": 2071,
2213
+ "▁mind": 2072,
2214
+ "▁mind.": 2073,
2215
+ "▁mindful": 2074,
2216
+ "▁mini-batches": 2075,
2217
+ "▁miniature": 2076,
2218
+ "▁minimize": 2077,
2219
+ "▁missing": 2078,
2220
+ "▁mission)": 2079,
2221
+ "▁mistakes": 2080,
2222
+ "▁mistral": 2081,
2223
+ "▁mistral@gmail.com": 2082,
2224
+ "▁misuse": 2083,
2225
+ "▁mitigate": 2084,
2226
+ "▁mitigation": 2085,
2227
+ "▁mix": 2086,
2228
+ "▁mixture-of-experts": 2087,
2229
+ "▁mixture-of-experts,": 2088,
2230
+ "▁model=model,": 2089,
2231
+ "▁model?": 2090,
2232
+ "▁modeling": 2091,
2233
+ "▁models)": 2092,
2234
+ "▁models:": 2093,
2235
+ "▁modern": 2094,
2236
+ "▁modify": 2095,
2237
+ "▁modules": 2096,
2238
+ "▁money": 2097,
2239
+ "▁months.": 2098,
2240
+ "▁moral.": 2099,
2241
+ "▁more!": 2100,
2242
+ "▁more,": 2101,
2243
+ "▁much": 2102,
2244
+ "▁multi-head": 2103,
2245
+ "▁multiple": 2104,
2246
+ "▁native": 2105,
2247
+ "▁naturally": 2106,
2248
+ "▁nature": 2107,
2249
+ "▁nature.": 2108,
2250
+ "▁needed.": 2109,
2251
+ "▁network": 2110,
2252
+ "▁network.": 2111,
2253
+ "▁networking.": 2112,
2254
+ "▁news": 2113,
2255
+ "▁next.": 2114,
2256
+ "▁non-linear": 2115,
2257
+ "▁non-negotiable.": 2116,
2258
+ "▁normalized,": 2117,
2259
+ "▁note": 2118,
2260
+ "▁notice": 2119,
2261
+ "▁now,": 2120,
2262
+ "▁nuances": 2121,
2263
+ "▁nudging": 2122,
2264
+ "▁num_train_epochs=3,": 2123,
2265
+ "▁numerical": 2124,
2266
+ "▁numerous": 2125,
2267
+ "▁observability": 2126,
2268
+ "▁offensive": 2127,
2269
+ "▁offering": 2128,
2270
+ "▁offers": 2129,
2271
+ "▁on.": 2130,
2272
+ "▁one.": 2131,
2273
+ "▁ones": 2132,
2274
+ "▁ongoing": 2133,
2275
+ "▁open-weight": 2134,
2276
+ "▁openai,": 2135,
2277
+ "▁opportunities": 2136,
2278
+ "▁opposing": 2137,
2279
+ "▁optimization": 2138,
2280
+ "▁optimize": 2139,
2281
+ "▁opting": 2140,
2282
+ "▁option:": 2141,
2283
+ "▁options.": 2142,
2284
+ "▁orders:": 2143,
2285
+ "▁organization’s": 2144,
2286
+ "▁original": 2145,
2287
+ "▁other,": 2146,
2288
+ "▁others)": 2147,
2289
+ "▁otherwise": 2148,
2290
+ "▁output.": 2149,
2291
+ "▁output_dir='./results',": 2150,
2292
+ "▁outputs.": 2151,
2293
+ "▁overall": 2152,
2294
+ "▁overfitting": 2153,
2295
+ "▁page,": 2154,
2296
+ "▁paper": 2155,
2297
+ "▁papers.": 2156,
2298
+ "▁paragraphs": 2157,
2299
+ "▁parallel,": 2158,
2300
+ "▁particular": 2159,
2301
+ "▁partition": 2160,
2302
+ "▁partner": 2161,
2303
+ "▁parts,": 2162,
2304
+ "▁parts:": 2163,
2305
+ "▁passed": 2164,
2306
+ "▁patience": 2165,
2307
+ "▁pattern": 2166,
2308
+ "▁patterns": 2167,
2309
+ "▁patterns.": 2168,
2310
+ "▁peculiarities": 2169,
2311
+ "▁people": 2170,
2312
+ "▁per": 2171,
2313
+ "▁per_device_eval_batch_size=4,": 2172,
2314
+ "▁per_device_train_batch_size=4,": 2173,
2315
+ "▁perfect": 2174,
2316
+ "▁perform,": 2175,
2317
+ "▁performance.3": 2176,
2318
+ "▁performs:": 2177,
2319
+ "▁perplexity": 2178,
2320
+ "▁perplexity,": 2179,
2321
+ "▁persist": 2180,
2322
+ "▁personal": 2181,
2323
+ "▁pertinent": 2182,
2324
+ "▁phase,": 2183,
2325
+ "▁philosophical": 2184,
2326
+ "▁phone": 2185,
2327
+ "▁pieces": 2186,
2328
+ "▁pipelines,": 2187,
2329
+ "▁place": 2188,
2330
+ "▁placeholder": 2189,
2331
+ "▁plan": 2190,
2332
+ "▁plans": 2191,
2333
+ "▁platform,": 2192,
2334
+ "▁platform.": 2193,
2335
+ "▁platforms": 2194,
2336
+ "▁play.": 2195,
2337
+ "▁please": 2196,
2338
+ "▁plethora": 2197,
2339
+ "▁plug": 2198,
2340
+ "▁poetry": 2199,
2341
+ "▁policies": 2200,
2342
+ "▁possibilities": 2201,
2343
+ "▁possible.": 2202,
2344
+ "▁post,": 2203,
2345
+ "▁postdoc": 2204,
2346
+ "▁posts": 2205,
2347
+ "▁potent": 2206,
2348
+ "▁potentially": 2207,
2349
+ "▁power,": 2208,
2350
+ "▁power.": 2209,
2351
+ "▁powerful,": 2210,
2352
+ "▁powering": 2211,
2353
+ "▁powers": 2212,
2354
+ "▁practice,": 2213,
2355
+ "▁practices": 2214,
2356
+ "▁pre-processed": 2215,
2357
+ "▁prebuilt": 2216,
2358
+ "▁precision": 2217,
2359
+ "▁precision,": 2218,
2360
+ "▁predicting": 2219,
2361
+ "▁predictions,": 2220,
2362
+ "▁preference": 2221,
2363
+ "▁preferences.": 2222,
2364
+ "▁prepared,": 2223,
2365
+ "▁preprocess": 2224,
2366
+ "▁preprocessed": 2225,
2367
+ "▁presented": 2226,
2368
+ "▁prevent": 2227,
2369
+ "▁price,": 2228,
2370
+ "▁primarily": 2229,
2371
+ "▁principle": 2230,
2372
+ "▁principles": 2231,
2373
+ "▁privacy.": 2232,
2374
+ "▁privilege:": 2233,
2375
+ "▁probably": 2234,
2376
+ "▁problem": 2235,
2377
+ "▁problems.": 2236,
2378
+ "▁procedures.": 2237,
2379
+ "▁processes.5": 2238,
2380
+ "▁processor.": 2239,
2381
+ "▁producing": 2240,
2382
+ "▁product": 2241,
2383
+ "▁production-level": 2242,
2384
+ "▁productivity.": 2243,
2385
+ "▁programmed": 2244,
2386
+ "▁programming": 2245,
2387
+ "▁progressing": 2246,
2388
+ "▁project,": 2247,
2389
+ "▁projects,": 2248,
2390
+ "▁projects.": 2249,
2391
+ "▁promotes": 2250,
2392
+ "▁propose": 2251,
2393
+ "▁proposes": 2252,
2394
+ "▁protect": 2253,
2395
+ "▁proven": 2254,
2396
+ "▁provider": 2255,
2397
+ "▁providers,": 2256,
2398
+ "▁providing": 2257,
2399
+ "▁purpose": 2258,
2400
+ "▁purposes.": 2259,
2401
+ "▁quality.": 2260,
2402
+ "▁quantity.": 2261,
2403
+ "▁rack-scale": 2262,
2404
+ "▁raise": 2263,
2405
+ "▁raises": 2264,
2406
+ "▁rates:": 2265,
2407
+ "▁rationale": 2266,
2408
+ "▁reaches": 2267,
2409
+ "▁read": 2268,
2410
+ "▁ready.": 2269,
2411
+ "▁real-time": 2270,
2412
+ "▁realistic": 2271,
2413
+ "▁reality": 2272,
2414
+ "▁really": 2273,
2415
+ "▁reason": 2274,
2416
+ "▁reasoning.4": 2275,
2417
+ "▁reasons": 2276,
2418
+ "▁recall,": 2277,
2419
+ "▁received": 2278,
2420
+ "▁recent": 2279,
2421
+ "▁recently": 2280,
2422
+ "▁recognition.": 2281,
2423
+ "▁recognize": 2282,
2424
+ "▁reconsider": 2283,
2425
+ "▁records": 2284,
2426
+ "▁record’": 2285,
2427
+ "▁reduce": 2286,
2428
+ "▁reducing": 2287,
2429
+ "▁refine": 2288,
2430
+ "▁reflects": 2289,
2431
+ "▁regard": 2290,
2432
+ "▁regarding": 2291,
2433
+ "▁regardless": 2292,
2434
+ "▁regulations": 2293,
2435
+ "▁regulations,": 2294,
2436
+ "▁relations,": 2295,
2437
+ "▁relationship": 2296,
2438
+ "▁released": 2297,
2439
+ "▁reliable": 2298,
2440
+ "▁rely": 2299,
2441
+ "▁remove": 2300,
2442
+ "▁repairman,": 2301,
2443
+ "▁repairman.": 2302,
2444
+ "▁repeatedly": 2303,
2445
+ "▁replaced": 2304,
2446
+ "▁replaces": 2305,
2447
+ "▁replicas,": 2306,
2448
+ "▁report": 2307,
2449
+ "▁report:": 2308,
2450
+ "▁reportedly": 2309,
2451
+ "▁repository:": 2310,
2452
+ "▁representation,": 2311,
2453
+ "▁representation.": 2312,
2454
+ "▁representations.": 2313,
2455
+ "▁representative": 2314,
2456
+ "▁request": 2315,
2457
+ "▁request.": 2316,
2458
+ "▁required.": 2317,
2459
+ "▁research-level": 2318,
2460
+ "▁reshaped": 2319,
2461
+ "▁resource": 2320,
2462
+ "▁resources.": 2321,
2463
+ "▁responding.": 2322,
2464
+ "▁response": 2323,
2465
+ "▁responsibilities.": 2324,
2466
+ "▁responsible": 2325,
2467
+ "▁restores": 2326,
2468
+ "▁restrict": 2327,
2469
+ "▁restricted": 2328,
2470
+ "▁resulting": 2329,
2471
+ "▁results.": 2330,
2472
+ "▁retention": 2331,
2473
+ "▁retention:": 2332,
2474
+ "▁retraining": 2333,
2475
+ "▁return_tensors='pt')": 2334,
2476
+ "▁reward": 2335,
2477
+ "▁rewarding": 2336,
2478
+ "▁robust": 2337,
2479
+ "▁route": 2338,
2480
+ "▁rules": 2339,
2481
+ "▁run": 2340,
2482
+ "▁run.": 2341,
2483
+ "▁runs.": 2342,
2484
+ "▁samples": 2343,
2485
+ "▁sampling:": 2344,
2486
+ "▁satisfactory": 2345,
2487
+ "▁satisfactory,": 2346,
2488
+ "▁saving": 2347,
2489
+ "▁scalable": 2348,
2490
+ "▁scale": 2349,
2491
+ "▁scaled": 2350,
2492
+ "▁scheduler": 2351,
2493
+ "▁scholarly": 2352,
2494
+ "▁scientific": 2353,
2495
+ "▁score": 2354,
2496
+ "▁score,": 2355,
2497
+ "▁scratch,": 2356,
2498
+ "▁scrub": 2357,
2499
+ "▁search": 2358,
2500
+ "▁search,": 2359,
2501
+ "▁secure": 2360,
2502
+ "▁secured": 2361,
2503
+ "▁security:": 2362,
2504
+ "▁seemingly": 2363,
2505
+ "▁seems": 2364,
2506
+ "▁select": 2365,
2507
+ "▁selected": 2366,
2508
+ "▁selected.": 2367,
2509
+ "▁selection": 2368,
2510
+ "▁separate": 2369,
2511
+ "▁series": 2370,
2512
+ "▁server.": 2371,
2513
+ "▁servers": 2372,
2514
+ "▁service": 2373,
2515
+ "▁service,": 2374,
2516
+ "▁services,": 2375,
2517
+ "▁session": 2376,
2518
+ "▁sets": 2377,
2519
+ "▁setup,": 2378,
2520
+ "▁shaped": 2379,
2521
+ "▁shared": 2380,
2522
+ "▁sharing": 2381,
2523
+ "▁shed": 2382,
2524
+ "▁short": 2383,
2525
+ "▁shown": 2384,
2526
+ "▁shows": 2385,
2527
+ "▁signals": 2386,
2528
+ "▁simplify": 2387,
2529
+ "▁simply": 2388,
2530
+ "▁since": 2389,
2531
+ "▁sits": 2390,
2532
+ "▁six": 2391,
2533
+ "▁size)": 2392,
2534
+ "▁sizes": 2393,
2535
+ "▁sizes:": 2394,
2536
+ "▁skill.": 2395,
2537
+ "▁skills": 2396,
2538
+ "▁skills.": 2397,
2539
+ "▁smart": 2398,
2540
+ "▁snippets": 2399,
2541
+ "▁social": 2400,
2542
+ "▁software": 2401,
2543
+ "▁solid": 2402,
2544
+ "▁solutions,": 2403,
2545
+ "▁sophistication,": 2404,
2546
+ "▁sourced": 2405,
2547
+ "▁space.": 2406,
2548
+ "▁sparse": 2407,
2549
+ "▁speak": 2408,
2550
+ "▁specialize": 2409,
2551
+ "▁specifications": 2410,
2552
+ "▁specificity.": 2411,
2553
+ "▁specified": 2412,
2554
+ "▁speed.": 2413,
2555
+ "▁speedups.": 2414,
2556
+ "▁spend": 2415,
2557
+ "▁spikes)": 2416,
2558
+ "▁spread": 2417,
2559
+ "▁stabilize": 2418,
2560
+ "▁stages:": 2419,
2561
+ "▁stand": 2420,
2562
+ "▁standardize": 2421,
2563
+ "▁standards": 2422,
2564
+ "▁stands": 2423,
2565
+ "▁start.": 2424,
2566
+ "▁start?": 2425,
2567
+ "▁static": 2426,
2568
+ "▁statistical": 2427,
2569
+ "▁stay": 2428,
2570
+ "▁stemming.": 2429,
2571
+ "▁steps).": 2430,
2572
+ "▁steps.": 2431,
2573
+ "▁stop": 2432,
2574
+ "▁stopped": 2433,
2575
+ "▁storage,": 2434,
2576
+ "▁store": 2435,
2577
+ "▁stored": 2436,
2578
+ "▁stories": 2437,
2579
+ "▁stories,": 2438,
2580
+ "▁stories.": 2439,
2581
+ "▁strategy,": 2440,
2582
+ "▁strength": 2441,
2583
+ "▁structure": 2442,
2584
+ "▁styles,": 2443,
2585
+ "▁suboptimal": 2444,
2586
+ "▁subsequent": 2445,
2587
+ "▁subsets": 2446,
2588
+ "▁substantial": 2447,
2589
+ "▁substantially": 2448,
2590
+ "▁subtleties": 2449,
2591
+ "▁subword": 2450,
2592
+ "▁subwords.": 2451,
2593
+ "▁success": 2452,
2594
+ "▁sufficient": 2453,
2595
+ "▁sufficiently": 2454,
2596
+ "▁suitable": 2455,
2597
+ "▁summarization": 2456,
2598
+ "▁summarization.": 2457,
2599
+ "▁summarizing": 2458,
2600
+ "▁supercomputer": 2459,
2601
+ "▁supercomputers.": 2460,
2602
+ "▁supervised": 2461,
2603
+ "▁supports": 2462,
2604
+ "▁sure": 2463,
2605
+ "▁surpass": 2464,
2606
+ "▁symbols.": 2465,
2607
+ "▁system.": 2466,
2608
+ "▁systematic": 2467,
2609
+ "▁systems,": 2468,
2610
+ "▁tailoring": 2469,
2611
+ "▁taken": 2470,
2612
+ "▁talent,": 2471,
2613
+ "▁talked": 2472,
2614
+ "▁target": 2473,
2615
+ "▁task,": 2474,
2616
+ "▁task-specific": 2475,
2617
+ "▁teaching": 2476,
2618
+ "▁team,": 2477,
2619
+ "▁technical": 2478,
2620
+ "▁techniques)": 2479,
2621
+ "▁techniques.": 2480,
2622
+ "▁technologies": 2481,
2623
+ "▁templates": 2482,
2624
+ "▁tension": 2483,
2625
+ "▁terminology,": 2484,
2626
+ "▁tests.": 2485,
2627
+ "▁text-generation": 2486,
2628
+ "▁texting": 2487,
2629
+ "▁texts": 2488,
2630
+ "▁texts,": 2489,
2631
+ "▁that’s": 2490,
2632
+ "▁them.": 2491,
2633
+ "▁there": 2492,
2634
+ "▁thing": 2493,
2635
+ "▁third-party": 2494,
2636
+ "▁thoughts": 2495,
2637
+ "▁thousands": 2496,
2638
+ "▁through.": 2497,
2639
+ "▁tight": 2498,
2640
+ "▁time,": 2499,
2641
+ "▁to:": 2500,
2642
+ "▁today's": 2501,
2643
+ "▁today,": 2502,
2644
+ "▁token?”": 2503,
2645
+ "▁tokenization.": 2504,
2646
+ "▁tokenize": 2505,
2647
+ "▁tokenizer": 2506,
2648
+ "▁tokenizer(\"Your": 2507,
2649
+ "▁tokenizing": 2508,
2650
+ "▁tokens)": 2509,
2651
+ "▁tone": 2510,
2652
+ "▁tone,": 2511,
2653
+ "▁took": 2512,
2654
+ "▁tools,": 2513,
2655
+ "▁top": 2514,
2656
+ "▁topic": 2515,
2657
+ "▁topics,": 2516,
2658
+ "▁topics.": 2517,
2659
+ "▁torch": 2518,
2660
+ "▁touch": 2519,
2661
+ "▁tracing": 2520,
2662
+ "▁traditional": 2521,
2663
+ "▁trail:": 2522,
2664
+ "▁train),": 2523,
2665
+ "▁train_dataset=train_dataset,": 2524,
2666
+ "▁trained.": 2525,
2667
+ "▁trainer": 2526,
2668
+ "▁trainer.train()": 2527,
2669
+ "▁training?": 2528,
2670
+ "▁training_args": 2529,
2671
+ "▁transformation": 2530,
2672
+ "▁translate": 2531,
2673
+ "▁translating": 2532,
2674
+ "▁translation.": 2533,
2675
+ "▁transmits": 2534,
2676
+ "▁trillion": 2535,
2677
+ "▁truly": 2536,
2678
+ "▁trusted": 2537,
2679
+ "▁trying": 2538,
2680
+ "▁tweak": 2539,
2681
+ "▁types.": 2540,
2682
+ "▁typically": 2541,
2683
+ "▁typos,": 2542,
2684
+ "▁undergraduate": 2543,
2685
+ "▁understand.": 2544,
2686
+ "▁understanding.": 2545,
2687
+ "▁undertaking.": 2546,
2688
+ "▁unit)": 2547,
2689
+ "▁units": 2548,
2690
+ "▁units),": 2549,
2691
+ "▁unlocks": 2550,
2692
+ "▁unnecessary": 2551,
2693
+ "▁unpublished,": 2552,
2694
+ "▁unseen": 2553,
2695
+ "▁unsolved": 2554,
2696
+ "▁until": 2555,
2697
+ "▁up,": 2556,
2698
+ "▁up/down": 2557,
2699
+ "▁updates": 2558,
2700
+ "▁updating": 2559,
2701
+ "▁upfront,": 2560,
2702
+ "▁usage": 2561,
2703
+ "▁used.": 2562,
2704
+ "▁users’": 2563,
2705
+ "▁uses": 2564,
2706
+ "▁utilization,": 2565,
2707
+ "▁utilized": 2566,
2708
+ "▁valuable": 2567,
2709
+ "▁valuable,": 2568,
2710
+ "▁values.": 2569,
2711
+ "▁variables": 2570,
2712
+ "▁variations.": 2571,
2713
+ "▁vast": 2572,
2714
+ "▁vector": 2573,
2715
+ "▁versatile.": 2574,
2716
+ "▁versatility": 2575,
2717
+ "▁version": 2576,
2718
+ "▁very": 2577,
2719
+ "▁video,": 2578,
2720
+ "▁visualize": 2579,
2721
+ "▁vitality": 2580,
2722
+ "▁volume": 2581,
2723
+ "▁wanted": 2582,
2724
+ "▁wants": 2583,
2725
+ "▁warmup": 2584,
2726
+ "▁warmup_steps=500,": 2585,
2727
+ "▁wasn’t": 2586,
2728
+ "▁way": 2587,
2729
+ "▁way,": 2588,
2730
+ "▁way.": 2589,
2731
+ "▁ways": 2590,
2732
+ "▁website,": 2591,
2733
+ "▁weeks": 2592,
2734
+ "▁weeks.": 2593,
2735
+ "▁weight_decay=0.01,": 2594,
2736
+ "▁well-formatted.": 2595,
2737
+ "▁well-known": 2596,
2738
+ "▁well:": 2597,
2739
+ "▁while": 2598,
2740
+ "▁whole": 2599,
2741
+ "▁wide": 2600,
2742
+ "▁widespread.": 2601,
2743
+ "▁with,": 2602,
2744
+ "▁with?": 2603,
2745
+ "▁within": 2604,
2746
+ "▁word,": 2605,
2747
+ "▁word.": 2606,
2748
+ "▁words.": 2607,
2749
+ "▁work,": 2608,
2750
+ "▁workflow.": 2609,
2751
+ "▁workflows": 2610,
2752
+ "▁working": 2611,
2753
+ "▁works": 2612,
2754
+ "▁works:": 2613,
2755
+ "▁worry,": 2614,
2756
+ "▁worthwhile": 2615,
2757
+ "▁wouldn’t": 2616,
2758
+ "▁writers,": 2617,
2759
+ "▁wrong": 2618,
2760
+ "▁year.": 2619,
2761
+ "▁years.": 2620,
2762
+ "▁you,": 2621,
2763
+ "▁zero": 2622,
2764
+ "▁zero-shot": 2623,
2765
+ "▁–": 2624,
2766
+ "▁“large”": 2625,
2767
+ "▁“use": 2626,
2768
+ "▁“what’s": 2627
2769
+ },
2770
+ "merges": []
2771
+ }
2772
+ }
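The vocabulary fragment above uses the `▁` (U+2581) word-boundary marker common to SentencePiece-style tokenizers, and the `merges` list is empty, so segmentation reduces to matching known pieces against the text rather than applying BPE merge rules. The toy sketch below illustrates that idea with a greedy longest-match over a hypothetical mini-vocabulary; it is not the actual algorithm of the tokenizer in this repository:

```python
# Toy illustration of greedy longest-match tokenization with a
# SentencePiece-style "▁" word-boundary marker. The vocabulary here is
# invented for the example; real segmentation (e.g. Unigram) differs.

TOY_VOCAB = {"▁the", "▁token", "iz", "er", "▁works", "▁work", "s"}

def tokenize(text: str) -> list[str]:
    # Spaces become the "▁" marker; the first word also gets one.
    s = "▁" + text.replace(" ", "▁")
    tokens = []
    i = 0
    while i < len(s):
        # Try the longest candidate piece first; fall back to a single
        # character so the loop always makes progress.
        for j in range(len(s), i, -1):
            piece = s[i:j]
            if piece in TOY_VOCAB or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

print(tokenize("the tokenizer works"))
# → ['▁the', '▁token', 'iz', 'er', '▁works']
```

Unknown characters fall through to the single-character fallback, which is roughly the role the `<unk>` token plays in the real vocabulary.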
tokenizer_config.json ADDED
@@ -0,0 +1,20 @@
+ {
+ "backend": "tokenizers",
+ "bos_token": "<s>",
+ "eos_token": "</s>",
+ "extra_special_tokens": [
+ "<pad>",
+ "</s>",
+ "<s>",
+ "<maskSub>",
+ "Question:",
+ "Réponse:"
+ ],
+ "is_local": true,
+ "mask_token": "<mask>",
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "<pad>",
+ "tokenizer_class": "GemmaTokenizer",
+ "unk_token": "<unk>",
+ "vocab_file": "tokenizer-sora/tokenizer.model"
+ }
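For context, the `bos_token`, `eos_token`, and `pad_token` declared in `tokenizer_config.json` are conventionally applied when sequences are prepared for training or batching: the sequence is wrapped in begin/end markers and right-padded to a common length. The helper below is a hypothetical sketch of that convention, not code from this repository:

```python
# Toy sketch: applying the special tokens from tokenizer_config.json.
# BOS/EOS/PAD values mirror the config above; build_inputs is invented
# for illustration.
BOS, EOS, PAD = "<s>", "</s>", "<pad>"

def build_inputs(tokens: list[str], max_len: int) -> list[str]:
    # Wrap the sequence in begin/end-of-sequence markers...
    seq = [BOS] + tokens + [EOS]
    # ...then right-pad to a fixed length, as batching usually requires.
    return seq + [PAD] * (max_len - len(seq))

print(build_inputs(["▁hello", "▁world"], 6))
# → ['<s>', '▁hello', '▁world', '</s>', '<pad>', '<pad>']
```

The very large `model_max_length` value is the Transformers sentinel for "no length limit recorded", so in practice callers choose their own `max_len`.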