End of training

Files changed (4)

README.md +91 -0
config.json +35 -0
model.safetensors +3 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,91 @@

+---
+license: mit
+base_model: vinai/bertweet-base
+tags:
+- generated_from_trainer
+metrics:
+- f1
+- precision
+- recall
+- accuracy
+model-index:
+- name: bertweet-base_regression_7_seed7_EN
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# bertweet-base_regression_7_seed7_EN
+This model is a fine-tuned version of [vinai/bertweet-base](https://huggingface.co/vinai/bertweet-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.0534
+- Mse: 5.3168
+- Rmse: 2.3058
+- Mae: 1.3446
+- R2: 0.2546
+- F1: 0.7813
+- Precision: 0.7833
+- Recall: 0.7850
+- Accuracy: 0.7850
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 200
+- num_epochs: 10
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Mse    | Rmse   | Mae    | R2      | F1     | Precision | Recall | Accuracy |
+|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:-------:|:------:|:---------:|:------:|:--------:|
+| 1.7289        | 0.4630 | 100  | 1.8447          | 9.6783 | 3.1110 | 2.2934 | -0.3936 | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.6915        | 0.9259 | 200  | 1.8018          | 8.9257 | 2.9876 | 2.2727 | -0.2852 | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.6504        | 1.3889 | 300  | 1.6921          | 7.9979 | 2.8280 | 2.1201 | -0.1516 | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.4742        | 1.8519 | 400  | 1.5208          | 6.7985 | 2.6074 | 1.8924 | 0.0211  | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.2813        | 2.3148 | 500  | 1.4051          | 6.0745 | 2.4647 | 1.7582 | 0.1253  | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.2569        | 2.7778 | 600  | 1.3427          | 5.7894 | 2.4061 | 1.6773 | 0.1664  | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.1022        | 3.2407 | 700  | 1.2733          | 5.3487 | 2.3127 | 1.6098 | 0.2298  | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 1.005         | 3.7037 | 800  | 1.2122          | 4.6814 | 2.1637 | 1.5743 | 0.3259  | 0.4570 | 0.3669    | 0.6057 | 0.6057   |
+| 0.9814        | 4.1667 | 900  | 1.1237          | 4.6217 | 2.1498 | 1.4468 | 0.3345  | 0.7191 | 0.7827    | 0.7467 | 0.7467   |
+| 0.8354        | 4.6296 | 1000 | 1.1104          | 4.8814 | 2.2094 | 1.4152 | 0.2971  | 0.7799 | 0.8035    | 0.7911 | 0.7911   |
+| 0.8169        | 5.0926 | 1100 | 1.0778          | 4.5463 | 2.1322 | 1.4105 | 0.3454  | 0.7926 | 0.7955    | 0.7963 | 0.7963   |
+| 0.7157        | 5.5556 | 1200 | 1.0451          | 4.6311 | 2.1520 | 1.3624 | 0.3332  | 0.7979 | 0.8010    | 0.8016 | 0.8016   |
+| 0.6988        | 6.0185 | 1300 | 1.0387          | 4.5981 | 2.1443 | 1.3629 | 0.3379  | 0.7823 | 0.7843    | 0.7859 | 0.7859   |
+| 0.6048        | 6.4815 | 1400 | 1.0342          | 4.8049 | 2.1920 | 1.3377 | 0.3081  | 0.7823 | 0.7843    | 0.7859 | 0.7859   |
+| 0.5695        | 6.9444 | 1500 | 1.0254          | 4.9339 | 2.2212 | 1.3273 | 0.2896  | 0.7844 | 0.7875    | 0.7885 | 0.7885   |
+| 0.5511        | 7.4074 | 1600 | 1.0084          | 5.0070 | 2.2376 | 1.3057 | 0.2790  | 0.7899 | 0.7992    | 0.7963 | 0.7963   |
+| 0.5313        | 7.8704 | 1700 | 1.0131          | 4.9329 | 2.2210 | 1.3143 | 0.2897  | 0.7859 | 0.7866    | 0.7885 | 0.7885   |
+| 0.4934        | 8.3333 | 1800 | 0.9928          | 4.8939 | 2.2122 | 1.2763 | 0.2953  | 0.7913 | 0.7970    | 0.7963 | 0.7963   |
+| 0.4688        | 8.7963 | 1900 | 1.0044          | 4.9293 | 2.2202 | 1.2969 | 0.2902  | 0.7859 | 0.7866    | 0.7885 | 0.7885   |
+| 0.4762        | 9.2593 | 2000 | 0.9942          | 4.8929 | 2.2120 | 1.2808 | 0.2955  | 0.7922 | 0.7959    | 0.7963 | 0.7963   |
+| 0.457         | 9.7222 | 2100 | 1.0085          | 4.9806 | 2.2317 | 1.2983 | 0.2828  | 0.7884 | 0.7894    | 0.7911 | 0.7911   |
+### Framework versions
+- Transformers 4.40.2
+- Pytorch 2.1.2
+- Datasets 2.18.0
+- Tokenizers 0.19.1
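
The hyperparameters in the README above name a linear scheduler with 200 warmup steps and a peak learning rate of 5e-06. A minimal sketch of such a schedule in plain Python follows. It assumes the common shape of linear warmup followed by linear decay to zero, and it infers 216 optimizer steps per epoch from the results table (step 200 falls at epoch 0.9259), which gives 2160 total steps over 10 epochs. Neither the total step count nor the exact scheduler implementation is stated in the commit, and the function name is illustrative.

```python
def linear_warmup_lr(step, peak_lr=5e-06, warmup_steps=200, total_steps=2160):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: total steps are inferred from the table, not stated in the README.
# 200 steps / 0.9259 epochs is about 216 steps per epoch, times 10 epochs.
steps_per_epoch = round(200 / 0.9259)
total_steps = steps_per_epoch * 10

for step in (100, 200, 1180, 2100):
    print(step, linear_warmup_lr(step, total_steps=total_steps))
```

Under these assumptions the learning rate peaks at step 200, between the second and third rows of the results table, and has decayed to a small fraction of its peak by the last logged step.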

config.json ADDED

	@@ -0,0 +1,35 @@

+{
+  "_name_or_path": "vinai/bertweet-base",
+  "architectures": [
+    "RobertaForSequenceClassification"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "bos_token_id": 0,
+  "classifier_dropout": null,
+  "eos_token_id": 2,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 130,
+  "model_type": "roberta",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 1,
+  "position_embedding_type": "absolute",
+  "tokenizer_class": "BertweetTokenizer",
+  "torch_dtype": "float32",
+  "transformers_version": "4.40.2",
+  "type_vocab_size": 1,
+  "use_cache": true,
+  "vocab_size": 64001
+}
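
The sizes in the config above determine the parameter count. As a rough sketch, the tally below assumes the usual `RobertaForSequenceClassification` layout in Transformers (embeddings, 12 encoder layers, and a two-layer classification head with no pooler) and a single output, as the one-entry `id2label` suggests. The function name and the breakdown are illustrative and not taken from the commit.

```python
def roberta_seq_cls_params(vocab=64001, hidden=768, layers=12, inter=3072,
                           max_pos=130, type_vocab=1, num_labels=1):
    """Estimate parameters of a RoBERTa encoder plus a sequence classification head."""
    layer_norm = 2 * hidden  # weight + bias
    # Word, position, and token-type embeddings, followed by one LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (with biases), plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> inter -> hidden), plus LayerNorm.
    feed_forward = (hidden * inter + inter) + (inter * hidden + hidden) + layer_norm
    # Assumed head: dense (hidden -> hidden), then projection (hidden -> num_labels).
    head = (hidden * hidden + hidden) + (hidden * num_labels + num_labels)
    return embeddings + layers * (attention + feed_forward) + head


params = roberta_seq_cls_params()
print(params)      # estimated parameter count
print(params * 4)  # estimated bytes at float32 (4 bytes per parameter)
```

Under these assumptions the estimate is about 134.9 million parameters, or roughly 539.6 MB at float32. That is slightly below the 539627092 bytes listed for model.safetensors; the small remainder would plausibly be the file header.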

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:761b762e5b54cb0bfddba3e9a56b2822958fbd6b8a9375db95e0586e08fe6b2a
+size 539627092
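
The three lines above are a Git LFS pointer rather than the weights themselves: a spec version, a SHA-256 object id, and the object size in bytes. A small illustrative parser (the function name is hypothetical) shows how the fields split and what the size implies for a float32 checkpoint.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:761b762e5b54cb0bfddba3e9a56b2822958fbd6b8a9375db95e0586e08fe6b2a
size 539627092
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into named fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], len(info["digest"]))
print(round(info["size"] / 1e6, 1), "MB")
# Upper bound on float32 parameters; the safetensors header also takes some bytes.
print(info["size"] // 4)
```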

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2894999c7911f829549d66d96c73322ac1aeeb02842d40175061d29fd41f6249
+size 4984