abte-restaurants-distilbert-base-uncased

Files changed (8)

README.md +158 -0
config.json +34 -0
model.safetensors +3 -0
special_tokens_map.json +7 -0
tokenizer.json +0 -0
tokenizer_config.json +56 -0
training_args.bin +3 -0
vocab.txt +0 -0

README.md ADDED

	@@ -0,0 +1,158 @@

+---
+library_name: transformers
+license: apache-2.0
+base_model: distilbert/distilbert-base-uncased
+tags:
+- generated_from_trainer
+model-index:
+- name: abte-restaurants-distilbert-base-uncased
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# abte-restaurants-distilbert-base-uncased
+This model is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on an unknown dataset.
+It achieves the following results on the evaluation set after the final epoch (epoch 100, step 1500):
+- Loss: 0.3605
+- F1-score: 0.8429
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 256
+- eval_batch_size: 256
+- seed: 42
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 100
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | F1-score |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.6511        | 1.0   | 15   | 0.5160          | 0.0210   |
+| 0.3533        | 2.0   | 30   | 0.2970          | 0.5713   |
+| 0.2243        | 3.0   | 45   | 0.2558          | 0.6359   |
+| 0.1706        | 4.0   | 60   | 0.2319          | 0.6803   |
+| 0.1363        | 5.0   | 75   | 0.2149          | 0.7386   |
+| 0.0983        | 6.0   | 90   | 0.2058          | 0.7840   |
+| 0.0763        | 7.0   | 105  | 0.2034          | 0.8062   |
+| 0.0614        | 8.0   | 120  | 0.2150          | 0.8121   |
+| 0.0484        | 9.0   | 135  | 0.2192          | 0.8166   |
+| 0.0406        | 10.0  | 150  | 0.2291          | 0.8243   |
+| 0.0341        | 11.0  | 165  | 0.2317          | 0.8284   |
+| 0.0278        | 12.0  | 180  | 0.2352          | 0.8334   |
+| 0.0244        | 13.0  | 195  | 0.2480          | 0.8261   |
+| 0.0221        | 14.0  | 210  | 0.2546          | 0.8288   |
+| 0.0208        | 15.0  | 225  | 0.2558          | 0.8288   |
+| 0.0175        | 16.0  | 240  | 0.2678          | 0.8317   |
+| 0.0164        | 17.0  | 255  | 0.2712          | 0.8225   |
+| 0.0141        | 18.0  | 270  | 0.2635          | 0.8365   |
+| 0.0128        | 19.0  | 285  | 0.2720          | 0.8356   |
+| 0.012         | 20.0  | 300  | 0.2800          | 0.8332   |
+| 0.0118        | 21.0  | 315  | 0.2837          | 0.8378   |
+| 0.0115        | 22.0  | 330  | 0.2866          | 0.8378   |
+| 0.0108        | 23.0  | 345  | 0.2893          | 0.8354   |
+| 0.0099        | 24.0  | 360  | 0.2955          | 0.8362   |
+| 0.0087        | 25.0  | 375  | 0.2979          | 0.8353   |
+| 0.0082        | 26.0  | 390  | 0.2957          | 0.8393   |
+| 0.0074        | 27.0  | 405  | 0.3025          | 0.8391   |
+| 0.0072        | 28.0  | 420  | 0.3022          | 0.8376   |
+| 0.0079        | 29.0  | 435  | 0.3137          | 0.8360   |
+| 0.0066        | 30.0  | 450  | 0.3118          | 0.8338   |
+| 0.0068        | 31.0  | 465  | 0.3132          | 0.8424   |
+| 0.0073        | 32.0  | 480  | 0.3071          | 0.8413   |
+| 0.0059        | 33.0  | 495  | 0.3048          | 0.8365   |
+| 0.0064        | 34.0  | 510  | 0.3218          | 0.8407   |
+| 0.0083        | 35.0  | 525  | 0.3187          | 0.8392   |
+| 0.006         | 36.0  | 540  | 0.3218          | 0.8396   |
+| 0.0056        | 37.0  | 555  | 0.3167          | 0.8431   |
+| 0.0051        | 38.0  | 570  | 0.3160          | 0.8404   |
+| 0.006         | 39.0  | 585  | 0.3229          | 0.8421   |
+| 0.005         | 40.0  | 600  | 0.3178          | 0.8408   |
+| 0.0049        | 41.0  | 615  | 0.3275          | 0.8388   |
+| 0.005         | 42.0  | 630  | 0.3265          | 0.8409   |
+| 0.0048        | 43.0  | 645  | 0.3221          | 0.8403   |
+| 0.0047        | 44.0  | 660  | 0.3212          | 0.8402   |
+| 0.0044        | 45.0  | 675  | 0.3221          | 0.8413   |
+| 0.0049        | 46.0  | 690  | 0.3278          | 0.8405   |
+| 0.0046        | 47.0  | 705  | 0.3348          | 0.8408   |
+| 0.0044        | 48.0  | 720  | 0.3305          | 0.8414   |
+| 0.0038        | 49.0  | 735  | 0.3358          | 0.8420   |
+| 0.0052        | 50.0  | 750  | 0.3368          | 0.8416   |
+| 0.0042        | 51.0  | 765  | 0.3298          | 0.8410   |
+| 0.004         | 52.0  | 780  | 0.3412          | 0.8359   |
+| 0.0045        | 53.0  | 795  | 0.3404          | 0.8371   |
+| 0.004         | 54.0  | 810  | 0.3332          | 0.8410   |
+| 0.0041        | 55.0  | 825  | 0.3361          | 0.8428   |
+| 0.0036        | 56.0  | 840  | 0.3355          | 0.8413   |
+| 0.0041        | 57.0  | 855  | 0.3396          | 0.8413   |
+| 0.0039        | 58.0  | 870  | 0.3441          | 0.8412   |
+| 0.004         | 59.0  | 885  | 0.3437          | 0.8419   |
+| 0.0039        | 60.0  | 900  | 0.3470          | 0.8407   |
+| 0.0037        | 61.0  | 915  | 0.3478          | 0.8434   |
+| 0.0036        | 62.0  | 930  | 0.3499          | 0.8454   |
+| 0.0036        | 63.0  | 945  | 0.3492          | 0.8437   |
+| 0.0043        | 64.0  | 960  | 0.3477          | 0.8429   |
+| 0.0039        | 65.0  | 975  | 0.3431          | 0.8409   |
+| 0.0035        | 66.0  | 990  | 0.3474          | 0.8434   |
+| 0.004         | 67.0  | 1005 | 0.3478          | 0.8436   |
+| 0.0034        | 68.0  | 1020 | 0.3526          | 0.8421   |
+| 0.0035        | 69.0  | 1035 | 0.3514          | 0.8459   |
+| 0.0033        | 70.0  | 1050 | 0.3527          | 0.8443   |
+| 0.0036        | 71.0  | 1065 | 0.3485          | 0.8430   |
+| 0.0036        | 72.0  | 1080 | 0.3521          | 0.8456   |
+| 0.0036        | 73.0  | 1095 | 0.3535          | 0.8433   |
+| 0.0036        | 74.0  | 1110 | 0.3578          | 0.8405   |
+| 0.0031        | 75.0  | 1125 | 0.3609          | 0.8414   |
+| 0.0033        | 76.0  | 1140 | 0.3563          | 0.8426   |
+| 0.0033        | 77.0  | 1155 | 0.3561          | 0.8441   |
+| 0.0032        | 78.0  | 1170 | 0.3550          | 0.8423   |
+| 0.0032        | 79.0  | 1185 | 0.3554          | 0.8414   |
+| 0.0031        | 80.0  | 1200 | 0.3554          | 0.8404   |
+| 0.0039        | 81.0  | 1215 | 0.3549          | 0.8413   |
+| 0.0034        | 82.0  | 1230 | 0.3548          | 0.8405   |
+| 0.0029        | 83.0  | 1245 | 0.3575          | 0.8443   |
+| 0.0032        | 84.0  | 1260 | 0.3579          | 0.8416   |
+| 0.0029        | 85.0  | 1275 | 0.3603          | 0.8408   |
+| 0.0031        | 86.0  | 1290 | 0.3611          | 0.8445   |
+| 0.0031        | 87.0  | 1305 | 0.3612          | 0.8444   |
+| 0.0029        | 88.0  | 1320 | 0.3620          | 0.8447   |
+| 0.0032        | 89.0  | 1335 | 0.3594          | 0.8416   |
+| 0.0041        | 90.0  | 1350 | 0.3586          | 0.8423   |
+| 0.0032        | 91.0  | 1365 | 0.3599          | 0.8423   |
+| 0.0031        | 92.0  | 1380 | 0.3598          | 0.8409   |
+| 0.0033        | 93.0  | 1395 | 0.3593          | 0.8424   |
+| 0.0029        | 94.0  | 1410 | 0.3593          | 0.8422   |
+| 0.003         | 95.0  | 1425 | 0.3607          | 0.8426   |
+| 0.0028        | 96.0  | 1440 | 0.3610          | 0.8449   |
+| 0.0029        | 97.0  | 1455 | 0.3607          | 0.8424   |
+| 0.003         | 98.0  | 1470 | 0.3609          | 0.8422   |
+| 0.0029        | 99.0  | 1485 | 0.3606          | 0.8433   |
+| 0.003         | 100.0 | 1500 | 0.3605          | 0.8429   |
+### Framework versions
+- Transformers 4.48.3
+- Pytorch 2.5.1+cu124
+- Datasets 3.2.0
+- Tokenizers 0.21.0
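
The hyperparameters and step counts in the model card above fit together arithmetically, and a short sketch makes that explicit. It assumes things the card does not state: zero warmup steps, a single device, no gradient accumulation, and a final partial batch that is kept. The helper names `linear_lr` and `train_size_bounds` are illustrative only.

```python
import math

# Values copied from the model card above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 256
NUM_EPOCHS = 100
STEPS_PER_EPOCH = 15  # the Step column grows by 15 per epoch

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total=total_steps):
    """Linear decay from base_lr to 0 (assumption: zero warmup steps)."""
    return base_lr * max(0.0, (total - step) / total)


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest training-set size N with ceil(N / batch_size) == steps_per_epoch.

    Assumption: one device, no gradient accumulation, last partial batch kept.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    print(total_steps)                 # 1500, matching the last row of the table
    print(linear_lr(0))                # 2e-05 at the start of training
    print(linear_lr(total_steps))      # 0.0 at the end of training
    low, high = train_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
    assert math.ceil(low / TRAIN_BATCH_SIZE) == STEPS_PER_EPOCH
    assert math.ceil(high / TRAIN_BATCH_SIZE) == STEPS_PER_EPOCH
    print(low, high)                   # 3585 3840
```

Under those assumptions the training split would hold between 3585 and 3840 examples; the card itself does not name the dataset, so this is only a bound implied by the logged step counts.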

config.json ADDED

	@@ -0,0 +1,34 @@

+{
+  "_name_or_path": "distilbert/distilbert-base-uncased",
+  "activation": "gelu",
+  "architectures": [
+    "DistilBertForTokenClassification"
+  ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
+  "id2label": {
+    "0": "O",
+    "1": "B-Term",
+    "2": "I-Term"
+  },
+  "initializer_range": 0.02,
+  "label2id": {
+    "B-Term": 1,
+    "I-Term": 2,
+    "O": 0
+  },
+  "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
+  "pad_token_id": 0,
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.48.3",
+  "vocab_size": 30522
+}
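
The `id2label` mapping in this config describes a BIO tagging scheme with a single entity type, `Term`. As a minimal sketch of how such predictions might be turned into aspect-term spans, the function below groups a `B-Term` word with the `I-Term` words that follow it. It assumes the label ids are already aligned to whole words (sub-token alignment is not shown), and `extract_terms` is a hypothetical helper, not part of this repository.

```python
# Mapping copied from config.json above (keys converted to int).
ID2LABEL = {0: "O", 1: "B-Term", 2: "I-Term"}


def extract_terms(words, label_ids):
    """Group BIO-tagged words into aspect-term strings.

    Assumption: one label id per word. An I-Term with no open span
    is treated like O and dropped.
    """
    terms = []
    current = []
    for word, label_id in zip(words, label_ids):
        label = ID2LABEL[label_id]
        if label == "B-Term":
            # A new term starts; close any span that is still open.
            if current:
                terms.append(" ".join(current))
            current = [word]
        elif label == "I-Term" and current:
            # Continue the open span.
            current.append(word)
        else:
            # O (or a stray I-Term) ends the open span, if any.
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


if __name__ == "__main__":
    words = ["the", "fried", "rice", "was", "great", "but", "service", "was", "slow"]
    label_ids = [0, 1, 2, 0, 0, 0, 1, 0, 0]
    print(extract_terms(words, label_ids))  # ['fried rice', 'service']
```

The example sentence and its labels are made up for illustration and are not model output.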

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:12d395ea0c8bb8afff0a739e2ab5826ead88c6c961cbbc9a87392a7de820a3c9
+size 265473092
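
The pointer's `size` can be cross-checked against the dimensions in config.json. The sketch below counts parameters assuming the usual DistilBERT layout (word and position embeddings with a LayerNorm, six blocks of four attention projections plus a two-layer feed-forward network with two LayerNorms, and a linear classifier over three labels) stored as float32. That layout is an assumption based on the `DistilBertForTokenClassification` architecture named in the config, not something read from the weights file.

```python
# Values copied from config.json and the LFS pointer above.
VOCAB_SIZE = 30522
MAX_POSITIONS = 512
DIM = 768
HIDDEN_DIM = 3072
N_LAYERS = 6
NUM_LABELS = 3
FILE_BYTES = 265_473_092
BYTES_PER_PARAM = 4  # torch_dtype is float32


def linear(n_in, n_out):
    """Parameters of a dense layer with bias."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Scale and shift vectors of a LayerNorm."""
    return 2 * dim


embeddings = VOCAB_SIZE * DIM + MAX_POSITIONS * DIM + layer_norm(DIM)
block = (
    4 * linear(DIM, DIM)        # query, key, value, and output projections
    + layer_norm(DIM)           # LayerNorm after attention
    + linear(DIM, HIDDEN_DIM)   # feed-forward expansion
    + linear(HIDDEN_DIM, DIM)   # feed-forward projection
    + layer_norm(DIM)           # LayerNorm after feed-forward
)
classifier = linear(DIM, NUM_LABELS)

total_params = embeddings + N_LAYERS * block + classifier
weight_bytes = total_params * BYTES_PER_PARAM
remaining_bytes = FILE_BYTES - weight_bytes

if __name__ == "__main__":
    print(total_params)     # 66365187
    print(weight_bytes)     # 265460748
    print(remaining_bytes)  # 12344
```

The 12,344 bytes left over are small enough to be plausibly explained by the safetensors header (tensor names, shapes, and offsets), though this sketch does not parse the file to confirm that.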

special_tokens_map.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "cls_token": "[CLS]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": "[UNK]"
+}

tokenizer.json ADDED

The diff for this file is too large to render.

tokenizer_config.json ADDED

	@@ -0,0 +1,56 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "DistilBertTokenizer",
+  "unk_token": "[UNK]"
+}
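
The special-token ids in `added_tokens_decoder` and the `model_max_length` of 512 determine how a single sequence is framed before it reaches the model. The sketch below imitates that framing by hand, assuming BERT-style `[CLS] ... [SEP]` wrapping with right padding. In practice the tokenizer does this itself; `frame` is a hypothetical helper and the word-piece ids in the example are made up.

```python
# Ids copied from added_tokens_decoder in tokenizer_config.json above.
PAD_ID = 0
CLS_ID = 101
SEP_ID = 102
MODEL_MAX_LENGTH = 512


def frame(token_ids, pad_to=None, max_length=MODEL_MAX_LENGTH):
    """Wrap word-piece ids as [CLS] ids [SEP], truncate, and optionally right-pad.

    Assumption: single-sequence input with BERT-style framing.
    """
    # Leave room for the two special tokens.
    body = list(token_ids)[: max_length - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and pad_to > len(input_ids):
        padding = pad_to - len(input_ids)
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding  # padding positions are masked out
    return input_ids, attention_mask


if __name__ == "__main__":
    ids, mask = frame([7, 8, 9], pad_to=8)  # 7, 8, 9 are placeholder ids
    print(ids)   # [101, 7, 8, 9, 102, 0, 0, 0]
    print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```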

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0ecf67565b542bd0c7a5adb3c402743548a98b904968a03f3a858036e0c09bb1
+size 5304

vocab.txt ADDED

The diff for this file is too large to render.