software-si committed on
Commit e9dea78 · verified · 1 Parent(s): 023680c

Add new CrossEncoder model

Files changed (7)
  1. README.md +48 -30
  2. config.json +15 -28
  3. model.safetensors +2 -2
  4. special_tokens_map.json +1 -15
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +17 -17
  7. vocab.txt +0 -0
README.md CHANGED
@@ -6,22 +6,22 @@ tags:
 - generated_from_trainer
 - dataset_size:553491
 - loss:CrossEntropyLoss
-base_model: cross-encoder/nli-deberta-v3-base
+base_model: dbmdz/bert-base-italian-uncased
 datasets:
 - software-si/kitchen-nli-it
 pipeline_tag: text-classification
 library_name: sentence-transformers
 ---
 
-# CrossEncoder based on cross-encoder/nli-deberta-v3-base
+# CrossEncoder based on dbmdz/bert-base-italian-uncased
 
-This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) on the [kitchen-nli-it](https://huggingface.co/datasets/software-si/kitchen-nli-it) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text pair classification.
+This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased) on the [kitchen-nli-it](https://huggingface.co/datasets/software-si/kitchen-nli-it) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text pair classification.
 
 ## Model Details
 
 ### Model Description
 - **Model Type:** Cross Encoder
-- **Base model:** [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) <!-- at revision 6c749ce3425cd33b46d187e45b92bbf96ee12ec7 -->
+- **Base model:** [dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased) <!-- at revision 55058d75cf3bc75a67a412584491b774cb99d68a -->
 - **Maximum Sequence Length:** 512 tokens
 - **Number of Output Labels:** 3 labels
 - **Training Dataset:**
@@ -54,11 +54,11 @@ from sentence_transformers import CrossEncoder
 model = CrossEncoder("software-si/kitchen-it-nli")
 # Get scores for pairs of texts
 pairs = [
-    ['modulo cucina lato posteriore di 70 cm, teglia dimensione gn1/1 due zone operative, completo di forno elettrico,', 'le zone cottura sono 4'],
-    ['unità di cottura unità posizionata su vano a giorno, quattro punti cottura, con piastre quadre,', 'il vano della cucina è aperto'],
-    ['unità di cottura disposta su forno a gas, con piastre tonde, dotata di quattro piastre di cottura,', 'la cucina è disposta su vano a giorno'],
-    ['piano cottura con le piastre quadre, dotata di quattro piastre di cottura, con teglie di gn1/1 70 cm di lato,', 'la cucina è alimentata a gas'],
-    ['modulo cucina sistema di cottura elettrico, quattro piastre di cottura, con teglie di gn1/1 profondità pari a 90 cm,', 'la cucina ha due zone cottura'],
+    ['piano cottura sopra forno preinstallato, dotata di 6 piastre di cottura, fornita di piastre quadrate, cucina alimentata a induzione,', 'la cucina è alimentata ad induzione'],
+    ['modulo cucina dimensione teglie di gn1/1 piastre di forma quadrata, di profondità 70 cm, con forno,', 'le piastre della cucina sono di tonde'],
+    ['modulo cucina modalità di alimentazione elettrica, con piastre tonde operative, forno alimentato a gas, 2 zone,', "l'alimentazione del forno è a gas"],
+    ['cucina con teglie di gn1/1 piastre tonde preinstallate, superficie di cottura elettrica, con forno incluso,', 'la cucina ha un forno integrato'],
+    ['cucina sei punti cottura, dimensione anteriore 70 cm, posta su vano, con cottura a gas,', 'la cucina è alimentata ad elettrico'],
 ]
 scores = model.predict(pairs)
 print(scores.shape)
@@ -114,13 +114,13 @@ You can finetune this model on your own dataset.
 | | premises | hypothesis | labels |
 |:--------|:---------|:-----------|:-------|
 | type | string | string | int |
-| details | <ul><li>min: 45 characters</li><li>mean: 104.07 characters</li><li>max: 153 characters</li></ul> | <ul><li>min: 12 characters</li><li>mean: 32.84 characters</li><li>max: 50 characters</li></ul> | <ul><li>0: ~30.90%</li><li>1: ~36.50%</li><li>2: ~32.60%</li></ul> |
+| details | <ul><li>min: 51 characters</li><li>mean: 104.34 characters</li><li>max: 153 characters</li></ul> | <ul><li>min: 12 characters</li><li>mean: 33.34 characters</li><li>max: 50 characters</li></ul> | <ul><li>0: ~31.80%</li><li>1: ~37.40%</li><li>2: ~30.80%</li></ul> |
 * Samples:
 | premises | hypothesis | labels |
 |:---------|:-----------|:-------|
-| <code>piano cottura forno a funzionamento elettrico, configurazione a 2 piastre, modulo con piastre tonde,</code> | <code>la cucina ha le piastre quadre</code> | <code>0</code> |
-| <code>cucina profondità standard 70 cm, dotata di piastre tonde, 6 bruciatori,</code> | <code>la cucina è profonda 70 cm</code> | <code>1</code> |
-| <code>cucina fornita di sei fuochi cottura, con teglie di gn1/1 70 cm di lato, operativa a induzione,</code> | <code>le piastre della cucina sono di forma quadrata</code> | <code>2</code> |
+| <code>cucina con piastre tonde, 4 fuochi, su base con forno elettrico,</code> | <code>la cucina ha un forno elettrico</code> | <code>1</code> |
+| <code>piano cottura profonda 90 cm, con sei zone cottura, piastre tonde incluse,</code> | <code>la cucina è profonda 90 cm</code> | <code>1</code> |
+| <code>piano cottura dotata di 6 fuochi di cottura, di profondità 70 cm, con teglie di gn1/1 piastre tonde integrate,</code> | <code>la dimensione della teglie è di gn1/1</code> | <code>1</code> |
 * Loss: [<code>CrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#crossentropyloss)
 
 ### Evaluation Dataset
@@ -134,13 +134,13 @@ You can finetune this model on your own dataset.
 | | premises | hypothesis | labels |
 |:--------|:---------|:-----------|:-------|
 | type | string | string | int |
-| details | <ul><li>min: 43 characters</li><li>mean: 103.98 characters</li><li>max: 159 characters</li></ul> | <ul><li>min: 12 characters</li><li>mean: 33.46 characters</li><li>max: 50 characters</li></ul> | <ul><li>0: ~30.00%</li><li>1: ~38.50%</li><li>2: ~31.50%</li></ul> |
+| details | <ul><li>min: 44 characters</li><li>mean: 103.86 characters</li><li>max: 149 characters</li></ul> | <ul><li>min: 12 characters</li><li>mean: 33.19 characters</li><li>max: 50 characters</li></ul> | <ul><li>0: ~31.60%</li><li>1: ~35.50%</li><li>2: ~32.90%</li></ul> |
 * Samples:
 | premises | hypothesis | labels |
 |:---------|:-----------|:-------|
-| <code>modulo cucina lato posteriore di 70 cm, teglia dimensione gn1/1 due zone operative, completo di forno elettrico,</code> | <code>le zone cottura sono 4</code> | <code>0</code> |
-| <code>unità di cottura unità posizionata su vano a giorno, quattro punti cottura, con piastre quadre,</code> | <code>il vano della cucina è aperto</code> | <code>1</code> |
-| <code>unità di cottura disposta su forno a gas, con piastre tonde, dotata di quattro piastre di cottura,</code> | <code>la cucina è disposta su vano a giorno</code> | <code>2</code> |
+| <code>piano cottura sopra forno preinstallato, dotata di 6 piastre di cottura, fornita di piastre quadrate, cucina alimentata a induzione,</code> | <code>la cucina è alimentata ad induzione</code> | <code>1</code> |
+| <code>modulo cucina dimensione teglie di gn1/1 piastre di forma quadrata, di profondità 70 cm, con forno,</code> | <code>le piastre della cucina sono di tonde</code> | <code>0</code> |
+| <code>modulo cucina modalità di alimentazione elettrica, con piastre tonde operative, forno alimentato a gas, 2 zone,</code> | <code>l'alimentazione del forno è a gas</code> | <code>1</code> |
 * Loss: [<code>CrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#crossentropyloss)
 
 ### Training Hyperparameters
@@ -195,7 +195,6 @@ You can finetune this model on your own dataset.
 - `seed`: 42
 - `data_seed`: None
 - `jit_mode_eval`: False
-- `use_ipex`: False
 - `bf16`: True
 - `fp16`: False
 - `fp16_opt_level`: O1
@@ -230,6 +229,8 @@ You can finetune this model on your own dataset.
 - `adafactor`: False
 - `group_by_length`: False
 - `length_column_name`: length
+- `project`: huggingface
+- `trackio_space_id`: trackio
 - `ddp_find_unused_parameters`: None
 - `ddp_bucket_cap_mb`: None
 - `ddp_broadcast_buffers`: False
@@ -262,7 +263,7 @@ You can finetune this model on your own dataset.
 - `torch_compile_backend`: None
 - `torch_compile_mode`: None
 - `include_tokens_per_second`: False
-- `include_num_input_tokens_seen`: False
+- `include_num_input_tokens_seen`: no
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
 - `batch_eval_metrics`: False
@@ -270,7 +271,7 @@ You can finetune this model on your own dataset.
 - `use_liger_kernel`: False
 - `liger_kernel_config`: None
 - `eval_use_gather_object`: False
-- `average_tokens_across_devices`: False
+- `average_tokens_across_devices`: True
 - `prompts`: None
 - `batch_sampler`: batch_sampler
 - `multi_dataset_batch_sampler`: proportional
@@ -282,16 +283,33 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | Training Loss | Validation Loss |
 |:------:|:----:|:-------------:|:---------------:|
-| 0.2312 | 2000 | 0.8441 | 0.3763 |
-| 0.4625 | 4000 | 0.2613 | 0.1401 |
-| 0.6937 | 6000 | 0.1298 | 0.0938 |
-| 0.9250 | 8000 | 0.0945 | 0.0853 |
+| 0.0462 | 400  | 1.1097 | 1.0908 |
+| 0.0925 | 800  | 1.074  | 1.0388 |
+| 0.1387 | 1200 | 1.0096 | 0.9463 |
+| 0.1850 | 1600 | 0.9181 | 0.8411 |
+| 0.2312 | 2000 | 0.8197 | 0.7405 |
+| 0.2775 | 2400 | 0.7356 | 0.6496 |
+| 0.3237 | 2800 | 0.6549 | 0.5535 |
+| 0.3700 | 3200 | 0.5595 | 0.4527 |
+| 0.4162 | 3600 | 0.4713 | 0.3730 |
+| 0.4625 | 4000 | 0.3963 | 0.3116 |
+| 0.5087 | 4400 | 0.3393 | 0.2627 |
+| 0.5550 | 4800 | 0.2966 | 0.2278 |
+| 0.6012 | 5200 | 0.2574 | 0.1980 |
+| 0.6475 | 5600 | 0.2278 | 0.1759 |
+| 0.6937 | 6000 | 0.2147 | 0.1613 |
+| 0.7400 | 6400 | 0.1944 | 0.1466 |
+| 0.7862 | 6800 | 0.1754 | 0.1387 |
+| 0.8325 | 7200 | 0.1658 | 0.1312 |
+| 0.8787 | 7600 | 0.1514 | 0.1244 |
+| 0.9250 | 8000 | 0.143  | 0.1133 |
+| 0.9712 | 8400 | 0.1313 | 0.1095 |
 
 
 ### Framework Versions
-- Python: 3.12.11
+- Python: 3.12.3
 - Sentence Transformers: 5.1.1
-- Transformers: 4.56.2
+- Transformers: 4.57.0
 - PyTorch: 2.8.0+cu128
 - Accelerate: 1.10.1
 - Datasets: 4.1.1
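The card's usage snippet ends at `print(scores.shape)`: with three output labels, `model.predict` returns one row of three logits per text pair. A minimal sketch of the usual post-processing, using a hard-coded list of dummy logits in place of real `model.predict` output (the numbers are illustrative only):

```python
import math

# Dummy logits standing in for `model.predict(pairs)` output:
# one row per text pair, three columns (the card's "Number of Output Labels: 3").
scores = [
    [2.1, -0.3, 0.4],   # pair 0
    [-1.2, 3.0, 0.1],   # pair 1
]

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

probs = [softmax(row) for row in scores]
pred_ids = [row.index(max(row)) for row in probs]
print(pred_ids)  # [0, 1]
```

The predicted ids then map to label names through the model config's `id2label`; since this checkpoint ships generic `LABEL_0`/`LABEL_1`/`LABEL_2` names, the semantic meaning of each id comes from the kitchen-nli-it dataset's labeling convention.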
config.json CHANGED
@@ -1,51 +1,38 @@
 {
   "architectures": [
-    "DebertaV2ForSequenceClassification"
+    "BertForSequenceClassification"
   ],
   "attention_probs_dropout_prob": 0.1,
-  "bos_token_id": 1,
+  "classifier_dropout": null,
   "dtype": "float32",
-  "eos_token_id": 2,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
   "hidden_size": 768,
   "id2label": {
-    "0": "contradiction",
-    "1": "entailment",
-    "2": "neutral"
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2"
   },
   "initializer_range": 0.02,
   "intermediate_size": 3072,
   "label2id": {
-    "contradiction": 0,
-    "entailment": 1,
-    "neutral": 2
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_2": 2
   },
-  "layer_norm_eps": 1e-07,
-  "legacy": true,
+  "layer_norm_eps": 1e-12,
   "max_position_embeddings": 512,
-  "max_relative_positions": -1,
-  "model_type": "deberta-v2",
-  "norm_rel_ebd": "layer_norm",
+  "model_type": "bert",
   "num_attention_heads": 12,
   "num_hidden_layers": 12,
   "pad_token_id": 0,
-  "pooler_dropout": 0,
-  "pooler_hidden_act": "gelu",
-  "pooler_hidden_size": 768,
-  "pos_att_type": [
-    "p2c",
-    "c2p"
-  ],
-  "position_biased_input": false,
-  "position_buckets": 256,
-  "relative_attention": true,
+  "position_embedding_type": "absolute",
   "sentence_transformers": {
     "activation_fn": "torch.nn.modules.linear.Identity",
     "version": "5.1.1"
   },
-  "share_att_key": true,
-  "transformers_version": "4.56.2",
-  "type_vocab_size": 0,
-  "vocab_size": 128100
+  "transformers_version": "4.57.0",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 31102
 }
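The new config keeps generic `LABEL_0`/`LABEL_1`/`LABEL_2` names where the old DeBERTa config carried `contradiction`/`entailment`/`neutral`. A small sketch of reading the `id2label` map from a config like the one above; note that in serialized HF configs the `id2label` keys are strings, so a predicted id must be converted before lookup (the excerpt below is a hand-copied fragment, not the full file):

```python
import json

# Minimal excerpt mirroring the id2label / label2id blocks of the new config.
config = json.loads("""
{
  "id2label": {"0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2"},
  "label2id": {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2}
}
""")

# id2label keys are strings in the JSON, so index with str(pred_id).
pred_id = 2
name = config["id2label"][str(pred_id)]
print(name)  # LABEL_2

# Sanity check: the two maps should be inverses of each other.
assert all(config["label2id"][v] == int(k) for k, v in config["id2label"].items())
```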
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d28631f1a91c2549b3f9e7e41e00333d91990df9d63ec5e9d32e69b9482a9714
-size 737722356
+oid sha256:ab435a94f874ed86c96fb3378132063758db4b6a31cb40897f511bdb71dd72d5
+size 439743484
special_tokens_map.json CHANGED
@@ -1,11 +1,4 @@
 {
-  "bos_token": {
-    "content": "[CLS]",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  },
   "cls_token": {
     "content": "[CLS]",
     "lstrip": false,
@@ -13,13 +6,6 @@
     "rstrip": false,
     "single_word": false
   },
-  "eos_token": {
-    "content": "[SEP]",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  },
   "mask_token": {
     "content": "[MASK]",
     "lstrip": false,
@@ -44,7 +30,7 @@
   "unk_token": {
     "content": "[UNK]",
     "lstrip": false,
-    "normalized": true,
+    "normalized": false,
     "rstrip": false,
     "single_word": false
   }
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -8,31 +8,31 @@
     "single_word": false,
     "special": true
   },
-  "1": {
-    "content": "[CLS]",
+  "101": {
+    "content": "[UNK]",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,
     "single_word": false,
     "special": true
   },
-  "2": {
-    "content": "[SEP]",
+  "102": {
+    "content": "[CLS]",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,
     "single_word": false,
     "special": true
   },
-  "3": {
-    "content": "[UNK]",
+  "103": {
+    "content": "[SEP]",
     "lstrip": false,
-    "normalized": true,
+    "normalized": false,
     "rstrip": false,
     "single_word": false,
     "special": true
   },
-  "128000": {
+  "104": {
     "content": "[MASK]",
     "lstrip": false,
     "normalized": false,
@@ -41,26 +41,26 @@
     "special": true
   }
   },
-  "bos_token": "[CLS]",
-  "clean_up_tokenization_spaces": false,
+  "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
-  "do_lower_case": false,
-  "eos_token": "[SEP]",
+  "do_basic_tokenize": true,
+  "do_lower_case": true,
   "extra_special_tokens": {},
   "mask_token": "[MASK]",
+  "max_len": 512,
   "max_length": 512,
   "model_max_length": 512,
+  "never_split": null,
   "pad_to_multiple_of": null,
   "pad_token": "[PAD]",
   "pad_token_type_id": 0,
   "padding_side": "right",
   "sep_token": "[SEP]",
-  "sp_model_kwargs": {},
-  "split_by_punct": false,
   "stride": 0,
-  "tokenizer_class": "DebertaV2TokenizerFast",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
   "truncation_side": "right",
   "truncation_strategy": "longest_first",
-  "unk_token": "[UNK]",
-  "vocab_type": "spm"
+  "unk_token": "[UNK]"
 }
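The new tokenizer_config sets `do_lower_case: true` with `strip_accents: null`; in BERT's basic tokenizer a null `strip_accents` means accents are stripped whenever lowercasing is enabled, which matters for accented Italian text like this model's inputs. A standalone sketch of that normalization step (this mimics the behavior, it is not the HF implementation):

```python
import unicodedata

def basic_normalize(text, do_lower_case=True, strip_accents=None):
    # Mirrors BERT's BasicTokenizer default: when strip_accents is None,
    # lowercasing also implies accent stripping (sketch, not the HF code).
    if do_lower_case:
        text = text.lower()
        if strip_accents is None:
            strip_accents = True
    if strip_accents:
        # NFD-decompose, then drop combining marks (Unicode category Mn).
        text = "".join(
            c for c in unicodedata.normalize("NFD", text)
            if unicodedata.category(c) != "Mn"
        )
    return text

print(basic_normalize("unità di cottura è QUADRA"))  # unita di cottura e quadra
```

With `do_lower_case=False` and `strip_accents=False` (the old DeBERTa-style setting), the same input would pass through unchanged.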
vocab.txt ADDED
The diff for this file is too large to render. See raw diff