End of training


Files changed (6)

README.md +48 -74
config.json +27 -24
model.safetensors +2 -2
runs/May01_20-01-50_Amal/events.out.tfevents.1714586517.Amal.31556.9 +3 -0
runs/May01_20-02-18_Amal/events.out.tfevents.1714586539.Amal.31556.10 +3 -0
training_args.bin +2 -2

README.md CHANGED

@@ -1,74 +1,48 @@
----
-tags:
-- generated_from_trainer
-model-index:
-- name: Bert-MLM
-  results: []
----
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# Bert-MLM
-This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 7.7544
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- training_steps: 1000
-### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.02  | 50   | 9.3840          |
-| 9.3687        | 0.03  | 100  | 8.6014          |
-| 9.3687        | 0.05  | 150  | 8.2440          |
-| 8.0254        | 0.06  | 200  | 8.0843          |
-| 8.0254        | 0.08  | 250  | 8.0234          |
-| 7.8649        | 0.09  | 300  | 7.9828          |
-| 7.8649        | 0.11  | 350  | 7.9550          |
-| 7.732         | 0.12  | 400  | 7.9101          |
-| 7.732         | 0.14  | 450  | 7.8946          |
-| 7.6192        | 0.15  | 500  | 7.8525          |
-| 7.6192        | 0.17  | 550  | 7.8461          |
-| 7.6378        | 0.18  | 600  | 7.8285          |
-| 7.6378        | 0.2   | 650  | 7.8182          |
-| 7.6338        | 0.22  | 700  | 7.7917          |
-| 7.6338        | 0.23  | 750  | nan             |
-| 7.5994        | 0.25  | 800  | 7.7837          |
-| 7.5994        | 0.26  | 850  | 7.7596          |
-| 7.5323        | 0.28  | 900  | 7.7634          |
-| 7.5323        | 0.29  | 950  | 7.7750          |
-| 7.5914        | 0.31  | 1000 | 7.7544          |
-### Framework versions
-- Transformers 4.36.2
-- Pytorch 2.1.2+cu118
-- Datasets 2.16.0
-- Tokenizers 0.15.0

+---
+license: apache-2.0
+base_model: AmalNlal/my_awesome_eli5_mlm_model
+tags:
+- generated_from_trainer
+model-index:
+- name: Bert-MLM
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Bert-MLM
+This model is a fine-tuned version of [AmalNlal/my_awesome_eli5_mlm_model](https://huggingface.co/AmalNlal/my_awesome_eli5_mlm_model) on the None dataset.
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 1
+### Framework versions
+- Transformers 4.36.2
+- Pytorch 2.1.2+cu118
+- Datasets 2.16.0
+- Tokenizers 0.15.0
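
For reference, the evaluation loss that the removed version of the card reported can be read as a perplexity. The following is a minimal sketch, assuming the logged loss is the mean masked-token cross-entropy in nats; `loss_to_perplexity` is a hypothetical helper and not part of this repository.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    if math.isnan(loss):
        # The step-750 row of the removed results table logged nan; propagate it.
        return float("nan")
    return math.exp(loss)


final_eval_loss = 7.7544  # last validation loss in the removed README table
print(f"perplexity ~ {loss_to_perplexity(final_eval_loss):.0f}")
```

The new card no longer lists an evaluation loss, so no comparable figure can be derived for the new checkpoint from this diff alone.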

config.json CHANGED

@@ -1,24 +1,27 @@
-{
-  "architectures": [
-    "BertForMaskedLM"
-  ],
-  "attention_probs_dropout_prob": 0.1,
-  "classifier_dropout": null,
-  "hidden_act": "gelu",
-  "hidden_dropout_prob": 0.1,
-  "hidden_size": 504,
-  "initializer_range": 0.02,
-  "intermediate_size": 1024,
-  "layer_norm_eps": 1e-12,
-  "max_position_embeddings": 256,
-  "model_type": "bert",
-  "num_attention_heads": 12,
-  "num_hidden_layers": 12,
-  "pad_token_id": 0,
-  "position_embedding_type": "absolute",
-  "torch_dtype": "float32",
-  "transformers_version": "4.36.2",
-  "type_vocab_size": 2,
-  "use_cache": true,
-  "vocab_size": 50000
-}

+{
+  "_name_or_path": "AmalNlal/my_awesome_eli5_mlm_model",
+  "architectures": [
+    "RobertaForMaskedLM"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "bos_token_id": 0,
+  "classifier_dropout": null,
+  "eos_token_id": 2,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 514,
+  "model_type": "roberta",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 6,
+  "pad_token_id": 1,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.36.2",
+  "type_vocab_size": 1,
+  "use_cache": true,
+  "vocab_size": 50265
+}

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fe2d982cc67a1f6b1d059e5a1590aee15b846bd470839309ba1e54c81f3b5976
-size 201153152

 version https://git-lfs.github.com/spec/v1
+oid sha256:6680570641a2e5ae2f4df9dffb4cd40625ef41f382153604f34769bff561ef10
+size 328693404
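
The size change can be sanity-checked against the two `config.json` versions in this commit. The sketch below estimates the parameter count of a BERT/RoBERTa-style masked LM from its config values. It assumes the output decoder weight is tied to the word embeddings, that there is no pooler layer, and that weights are stored as 4-byte floats (per `"torch_dtype": "float32"`). `encoder_mlm_params` is a hypothetical helper written for this illustration.

```python
def encoder_mlm_params(vocab, hidden, intermediate, layers, max_pos, type_vocab):
    """Estimate parameters of a BERT/RoBERTa-style masked LM (tied decoder, no pooler)."""
    layer_norm = 2 * hidden  # LayerNorm weight + bias
    # Word, position and token-type embeddings, followed by one LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Q, K, V and output projections (weights + biases), then one LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> intermediate -> hidden), then one LayerNorm.
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    # MLM head: dense transform, LayerNorm, and a per-token output bias.
    head = (hidden * hidden + hidden) + layer_norm + vocab
    return embeddings + layers * (attention + feed_forward) + head


# Values copied from the removed and added config.json in this diff.
old = encoder_mlm_params(50000, 504, 1024, 12, 256, 2)
new = encoder_mlm_params(50265, 768, 3072, 6, 514, 1)

# File sizes copied from the model.safetensors LFS pointers in this diff.
for name, params, size in [("old", old, 201153152), ("new", new, 328693404)]:
    extra = size - 4 * params
    print(f"{name}: {params:,} params, {4 * params:,} float32 bytes, file is {extra:,} bytes larger")
```

Under these assumptions the estimate lands within a few kilobytes of each file size. The small remainder would be consistent with the safetensors metadata header, though that is not verified here.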

runs/May01_20-01-50_Amal/events.out.tfevents.1714586517.Amal.31556.9 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b64bfb5562d8aaf7e17f679f425427d171f3e550d06bcb947e97bb8959c21f0e
+size 4350

runs/May01_20-02-18_Amal/events.out.tfevents.1714586539.Amal.31556.10 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2c1e579fb725222e9906cedcb80dc460f25627670df5c9aab73902dabcf5206a
+size 4346

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:63cdc91d4d15fc792951e823f4a5979a46afbd3fa7eb9a7873a2efc58156f9ac
-size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:126327bbd43737948cb3e9c1c05ae62aed137a50d1f673f3fcde64c5b0fc7318
+size 4792