mekjr1 committed
Commit 4d610bb · 1 Parent(s): 336c7f4

Update README.md

Files changed (1):
  1. README.md +32 -11
README.md CHANGED
@@ -5,52 +5,73 @@ tags:
  model-index:
  - name: mekjr1/guilbert-base-uncased
  results: []
  ---

- <!-- This model card has been generated automatically according to the information Keras had access to. You should
- probably proofread and complete it, then remove this comment. -->

  # mekjr1/guilbert-base-uncased

- This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Train Loss: 1.9616
- - Validation Loss: 1.8529
- - Epoch: 8

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:

  - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 2e-05, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 7167, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
  - training_precision: mixed_float16

  ### Training results

  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
  | 1.9626 | 1.9024 | 5 |
  | 1.9574 | 1.8421 | 6 |
  | 1.9594 | 1.8632 | 7 |
  | 1.9616 | 1.8529 | 8 |

  ### Framework versions

  - Transformers 4.26.1
  - TensorFlow 2.11.0
  - Datasets 2.10.1
- - Tokenizers 0.13.2
 
  model-index:
  - name: mekjr1/guilbert-base-uncased
  results: []
+ datasets:
+ - mekjr1/guilbert_lm
+ language:
+ - en
  ---

  # mekjr1/guilbert-base-uncased

+ This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the guilbert dataset. It is a masked language model that predicts missing tokens in a sentence.

  ## Model description

+ The model is based on the `bert-base-uncased` architecture, which has 12 layers, 768 hidden units, and 12 attention heads. It was fine-tuned on samples labeled as guilt or non-guilt from the Vent dataset. The model was trained with a maximum sequence length of 128 tokens and a batch size of 32. Training used the AdamW optimizer with a learning rate of 2e-5, a weight decay rate of 0.01, and a linear learning-rate warmup over 1,000 steps. The model reached a validation loss of 1.8529 after 8 epochs.
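As a quick illustration of the masked-token behaviour described above, the checkpoint can be queried with the `fill-mask` pipeline. This is a usage sketch, not part of the original card: it assumes the checkpoint is published on the Hub under the id in the card title, and the example sentence is made up.

```python
def top_tokens(predictions, k=3):
    """Return the k highest-scoring token strings from `fill-mask` output.

    Each prediction is a dict with (at least) `token_str` and `score` keys,
    as returned by the transformers fill-mask pipeline for a single mask.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [p["token_str"] for p in ranked[:k]]

def demo():
    # Heavy import kept local; this downloads the checkpoint on first use.
    from transformers import pipeline
    fill = pipeline("fill-mask", model="mekjr1/guilbert-base-uncased")
    # bert-base-uncased tokenizers use the literal [MASK] token.
    return top_tokens(fill("I feel so [MASK] about what happened."))
```

For a single `[MASK]`, the pipeline returns a list of candidate completions with scores; `top_tokens` just orders and truncates them.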
  ## Intended uses & limitations

+ This model can be used to predict missing tokens in text sequences, particularly in the context of detecting the emotion of guilt in documents and other related applications.
+ However, the accuracy of the model may be limited by the quality and representativeness of the training data, as well as by the biases present in the pre-trained `bert-base-uncased` architecture.

  ## Training and evaluation data

+ The model was trained on samples labeled as guilt or non-guilt from the guilbert dataset (extracted from Vent).

  ## Training procedure

+ The model was trained using TensorFlow/Keras with the AdamW optimizer and a learning rate of 2e-5, a batch size of 32, and a maximum sequence length of 128 tokens. The optimizer used a weight decay rate of 0.01 and a linear learning-rate warmup over 1,000 steps. The model was trained for 8 epochs, with early stopping based on the validation loss, and reached a validation loss of 1.8529 at epoch 8.
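For context, the masked-language-modelling objective the model was trained on works by corrupting a fraction of input tokens and training the model to recover them. Below is a minimal sketch of the standard BERT-style 80/10/10 masking recipe; it is illustrative only, not the exact training code, and the function and toy vocabulary are hypothetical.

```python
import random

def mask_tokens(tokens, vocab, mask_token="[MASK]", p=0.15, rng=None):
    """BERT-style MLM corruption: select ~p of positions; of those,
    80% become [MASK], 10% a random vocab token, 10% stay unchanged.
    Labels record the original token at each selected position."""
    rng = rng or random.Random(0)
    inputs, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < p:
            labels[i] = tok                    # the model must predict this token
            r = rng.random()
            if r < 0.8:
                inputs[i] = mask_token         # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = rng.choice(vocab)  # 10%: random replacement
            # remaining 10%: keep the original token
    return inputs, labels
```

Only the selected positions contribute to the loss; unselected positions carry a `None` label and are left untouched.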
+
  ### Training hyperparameters

  The following hyperparameters were used during training:
+
+ - Optimizer: `AdamWeightDecay` with a `WarmUp` learning-rate schedule: `initial_learning_rate=2e-05`, linear `PolynomialDecay` to 0.0 over `decay_steps=7167`, `warmup_steps=1000`, `power=1.0`
+ - Weight decay rate: 0.01
+ - Batch size: 32
+ - Maximum sequence length: 128
+ - Number of warmup steps: 1,000
+ - Number of training steps: 1,761
+
+ The full serialized optimizer configuration:
+
  - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 2e-05, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 7167, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
  - training_precision: mixed_float16
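The warmup-plus-decay schedule in the optimizer configuration above can be reproduced in a few lines. A pure-Python sketch, assuming the `transformers` `WarmUp` convention of evaluating the decay schedule at `step - warmup_steps`:

```python
def learning_rate(step, peak=2e-5, warmup_steps=1000, decay_steps=7167):
    """Linear warmup to `peak` over `warmup_steps`, then linear
    (power=1.0) polynomial decay to 0.0 over `decay_steps`."""
    if step < warmup_steps:
        return peak * step / warmup_steps           # linear warmup
    decay_step = min(step - warmup_steps, decay_steps)
    return peak * (1.0 - decay_step / decay_steps)  # linear decay to 0.0
```

The rate rises from 0 to 2e-5 over the first 1,000 steps and then falls linearly back to 0 by step 8,167.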

  ### Training results

+ The following table shows the training and validation loss for each epoch:
+
  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
+ | 2.0976 | 1.8593 | 0 |
+ | 1.9643 | 1.8547 | 1 |
+ | 1.9651 | 1.9003 | 2 |
+ | 1.9608 | 1.8617 | 3 |
+ | 1.9646 | 1.8756 | 4 |
  | 1.9626 | 1.9024 | 5 |
  | 1.9574 | 1.8421 | 6 |
  | 1.9594 | 1.8632 | 7 |
  | 1.9616 | 1.8529 | 8 |
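Since training monitored the validation loss, it is worth noting that the best score in the table above occurs at epoch 6 (1.8421), not at the final epoch. A small illustrative helper to read that off programmatically:

```python
# The results table above, as (epoch, train_loss, val_loss) rows.
history = [
    (0, 2.0976, 1.8593), (1, 1.9643, 1.8547), (2, 1.9651, 1.9003),
    (3, 1.9608, 1.8617), (4, 1.9646, 1.8756), (5, 1.9626, 1.9024),
    (6, 1.9574, 1.8421), (7, 1.9594, 1.8632), (8, 1.9616, 1.8529),
]

def best_epoch(rows):
    """Epoch with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])[0]
```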

  ### Framework versions

  - Transformers 4.26.1
  - TensorFlow 2.11.0
  - Datasets 2.10.1
+ - Tokenizers 0.13.2