laura2243 committed c56862e (verified; parent: b2078f2): Upload folder using huggingface_hub
README.md ADDED
@@ -0,0 +1,396 @@
+ ---
+ tags:
+ - sentence-transformers
+ - cross-encoder
+ - reranker
+ - generated_from_trainer
+ - dataset_size:43188
+ - loss:BinaryCrossEntropyLoss
+ base_model: cross-encoder/nli-deberta-v3-base
+ pipeline_tag: text-ranking
+ library_name: sentence-transformers
+ metrics:
+ - accuracy
+ - accuracy_threshold
+ - f1
+ - f1_threshold
+ - precision
+ - recall
+ - average_precision
+ model-index:
+ - name: CrossEncoder based on cross-encoder/nli-deberta-v3-base
+   results:
+   - task:
+       type: cross-encoder-binary-classification
+       name: Cross Encoder Binary Classification
+     dataset:
+       name: paws val judge
+       type: paws-val-judge
+     metrics:
+     - type: accuracy
+       value: 0.9645748987854251
+       name: Accuracy
+     - type: accuracy_threshold
+       value: 0.08707074075937271
+       name: Accuracy Threshold
+     - type: f1
+       value: 0.9604876947392187
+       name: F1
+     - type: f1_threshold
+       value: 0.08707074075937271
+       name: F1 Threshold
+     - type: precision
+       value: 0.9470169189670525
+       name: Precision
+     - type: recall
+       value: 0.9743472285845167
+       name: Recall
+     - type: average_precision
+       value: 0.9870268561433264
+       name: Average Precision
+ ---
+
+ # CrossEncoder based on cross-encoder/nli-deberta-v3-base
+
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Cross Encoder
+ - **Base model:** [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) <!-- at revision 6c749ce3425cd33b46d187e45b92bbf96ee12ec7 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Output Labels:** 1 label
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import CrossEncoder
+
+ # Download from the 🤗 Hub
+ model = CrossEncoder("cross_encoder_model_id")
+ # Get scores for pairs of texts
+ pairs = [
+     ['Route 309 is a Connecticut State Highway in the northwestern Hartford suburbs from Canton to Simsbury .', 'Route 309 runs a Canton State Highway in the northwestern Connecticut suburbs from Hartford to Simsbury .'],
+     ['During the competition she lost 50-25 to Zimbabwe , 84-16 to Tanzania , 58-24 to South Africa .', 'During the competition , they lost 50-25 to Zimbabwe , 84-16 to Tanzania , 58-24 to South Africa .'],
+     ['The latter study is one of the few prospective demonstrations that environmental stress with high blood pressure and LVH remains associated .', 'The latter study remains one of the few prospective demonstrations that environmental stress with high blood pressure and LVH is associated .'],
+     ['The Marignane is located at Marseille Airport in Provence .', 'The Marignane is located in Marseille Provence Airport .'],
+     ['Birleffi was of Italian descent and Roman - Catholic in a predominantly Protestant state .', 'Birleffi was of Italian ethnicity and Roman Catholic in a predominantly Protestant state .'],
+ ]
+ scores = model.predict(pairs)
+ print(scores.shape)
+ # (5,)
+
+ # Or rank different texts based on similarity to a single text
+ ranks = model.rank(
+     'Route 309 is a Connecticut State Highway in the northwestern Hartford suburbs from Canton to Simsbury .',
+     [
+         'Route 309 runs a Canton State Highway in the northwestern Connecticut suburbs from Hartford to Simsbury .',
+         'During the competition , they lost 50-25 to Zimbabwe , 84-16 to Tanzania , 58-24 to South Africa .',
+         'The latter study remains one of the few prospective demonstrations that environmental stress with high blood pressure and LVH is associated .',
+         'The Marignane is located in Marseille Provence Airport .',
+         'Birleffi was of Italian ethnicity and Roman Catholic in a predominantly Protestant state .',
+     ]
+ )
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Cross Encoder Binary Classification
+
+ * Dataset: `paws-val-judge`
+ * Evaluated with [<code>CEBinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CEBinaryClassificationEvaluator)
+
+ | Metric                | Value     |
+ |:----------------------|:----------|
+ | accuracy              | 0.9646    |
+ | accuracy_threshold    | 0.0871    |
+ | f1                    | 0.9605    |
+ | f1_threshold          | 0.0871    |
+ | precision             | 0.947     |
+ | recall                | 0.9743    |
+ | **average_precision** | **0.987** |
+
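The `accuracy_threshold` and `f1_threshold` rows above are the cut-offs on the model's sigmoid output that maximized accuracy and F1 on `paws-val-judge`. A minimal sketch of how such a threshold would be applied to raw scores; the example scores here are made up for illustration, not actual model outputs:

```python
# Decision threshold reported in the metrics table above.
THRESHOLD = 0.08707074075937271

def scores_to_labels(scores, threshold=THRESHOLD):
    """Binarize CrossEncoder scores: 1 = paraphrase, 0 = not a paraphrase."""
    return [1 if s >= threshold else 0 for s in scores]

example_scores = [0.01, 0.09, 0.85, 0.03, 0.42]  # hypothetical sigmoid outputs
print(scores_to_labels(example_scores))
# [0, 1, 1, 0, 1]
```

Note that the tuned threshold sits well below 0.5: it was chosen to maximize the metric on the validation set rather than fixed a priori.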
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 43,188 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 | label |
+   |:--------|:-----------|:-----------|:------|
+   | type    | string | string | float |
+   | details | <ul><li>min: 38 characters</li><li>mean: 114.71 characters</li><li>max: 200 characters</li></ul> | <ul><li>min: 42 characters</li><li>mean: 114.33 characters</li><li>max: 215 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.46</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 | label |
+   |:-----------|:-----------|:------|
+   | <code>Route 309 is a Connecticut State Highway in the northwestern Hartford suburbs from Canton to Simsbury .</code> | <code>Route 309 runs a Canton State Highway in the northwestern Connecticut suburbs from Hartford to Simsbury .</code> | <code>0.0</code> |
+   | <code>During the competition she lost 50-25 to Zimbabwe , 84-16 to Tanzania , 58-24 to South Africa .</code> | <code>During the competition , they lost 50-25 to Zimbabwe , 84-16 to Tanzania , 58-24 to South Africa .</code> | <code>1.0</code> |
+   | <code>The latter study is one of the few prospective demonstrations that environmental stress with high blood pressure and LVH remains associated .</code> | <code>The latter study remains one of the few prospective demonstrations that environmental stress with high blood pressure and LVH is associated .</code> | <code>1.0</code> |
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
+   ```json
+   {
+       "activation_fn": "torch.nn.modules.linear.Identity",
+       "pos_weight": null
+   }
+   ```
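With `activation_fn` set to `Identity`, the model emits a raw logit and the sigmoid is applied inside the loss itself (PyTorch's BCE-with-logits formulation), while `pos_weight: null` weights positive and negative pairs equally. A pure-Python sketch of the per-pair objective, for illustration only:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit, as in BCEWithLogitsLoss
    with no pos_weight (matching the parameters above)."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

# A confident, correct logit incurs a small loss; the same logit paired
# with the opposite label is penalized by roughly its magnitude.
print(bce_with_logits(4.0, 1.0))  # small (~0.018)
print(bce_with_logits(4.0, 0.0))  # large (~4.018)
```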
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 3
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ | Epoch  | Step | Training Loss | paws-val-judge_average_precision |
+ |:------:|:----:|:-------------:|:--------------------------------:|
+ | 0.1852 | 500  | 0.3758        | -                                |
+ | 0.3704 | 1000 | 0.226         | -                                |
+ | 0.5556 | 1500 | 0.2176        | -                                |
+ | 0.7407 | 2000 | 0.1778        | -                                |
+ | 0.9259 | 2500 | 0.1757        | -                                |
+ | 1.0    | 2700 | -             | 0.9826                           |
+ | 1.1111 | 3000 | 0.1494        | -                                |
+ | 1.2963 | 3500 | 0.1271        | -                                |
+ | 1.4815 | 4000 | 0.1197        | -                                |
+ | 1.6667 | 4500 | 0.1263        | -                                |
+ | 1.8519 | 5000 | 0.116         | -                                |
+ | 2.0    | 5400 | -             | 0.9852                           |
+ | 2.0370 | 5500 | 0.1084        | -                                |
+ | 2.2222 | 6000 | 0.0707        | -                                |
+ | 2.4074 | 6500 | 0.0741        | -                                |
+ | 2.5926 | 7000 | 0.0713        | -                                |
+ | 2.7778 | 7500 | 0.0723        | -                                |
+ | 2.9630 | 8000 | 0.0727        | -                                |
+ | 3.0    | 8100 | -             | 0.9870                           |
+
+
+ ### Framework Versions
+ - Python: 3.12.12
+ - Sentence Transformers: 5.2.0
+ - Transformers: 4.57.3
+ - PyTorch: 2.9.0+cu126
+ - Accelerate: 1.12.0
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "[MASK]": 128000
+ }
config.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "architectures": [
+     "DebertaV2ForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 1,
+   "dtype": "float32",
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-07,
+   "legacy": true,
+   "max_position_embeddings": 512,
+   "max_relative_positions": -1,
+   "model_type": "deberta-v2",
+   "norm_rel_ebd": "layer_norm",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "pooler_dropout": 0,
+   "pooler_hidden_act": "gelu",
+   "pooler_hidden_size": 768,
+   "pos_att_type": [
+     "p2c",
+     "c2p"
+   ],
+   "position_biased_input": false,
+   "position_buckets": 256,
+   "relative_attention": true,
+   "sentence_transformers": {
+     "activation_fn": "torch.nn.modules.activation.Sigmoid",
+     "version": "5.2.0"
+   },
+   "share_att_key": true,
+   "transformers_version": "4.57.3",
+   "type_vocab_size": 0,
+   "vocab_size": 128100
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2864cf94240701754335466e4d9584b2c9ad1467c78c7c162a9592922e4d85fd
+ size 737716196
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
spm.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
+ size 2464616
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128000": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "[CLS]",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "eos_token": "[SEP]",
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "sp_model_kwargs": {},
+   "split_by_punct": false,
+   "tokenizer_class": "DebertaV2Tokenizer",
+   "unk_token": "[UNK]",
+   "vocab_type": "spm"
+ }