agata-luk committed · commit 4c6ff2a · verified · 1 parent: 047331b

Training complete

Files changed (2):
  1. README.md (+18 −11)
  2. generation_config.json (+2 −33)
README.md CHANGED
@@ -5,6 +5,8 @@ base_model: google-t5/t5-small
 tags:
 - simplification
 - generated_from_trainer
+metrics:
+- bleu
 model-index:
 - name: t5-neutralization
   results: []
@@ -16,6 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
 # t5-neutralization
 
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1686
+- Bleu: 53.022
+- Gen Len: 18.5417
 
 ## Model description
 
@@ -35,23 +41,24 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 5.6e-05
-- train_batch_size: 18
-- eval_batch_size: 18
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 1
+- num_epochs: 2
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
-| No log        | 1.0   | 196  | 0.2997          | 52.9831 | 18.5417 |
+| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
+| No log        | 1.0   | 440  | 0.1988          | 53.022 | 18.5417 |
+| 0.2893        | 2.0   | 880  | 0.1686          | 53.022 | 18.5417 |
 
 
 ### Framework versions
 
-- Transformers 5.2.0
-- Pytorch 2.10.0+cpu
-- Datasets 4.6.0
-- Tokenizers 0.22.2
+- Transformers 4.51.2
+- Pytorch 2.10.0+cu128
+- Datasets 4.0.0
+- Tokenizers 0.21.4
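The `lr_scheduler_type: linear` entry in the diff above means the learning rate decays linearly from `learning_rate` toward zero over the run's 880 optimizer steps (440 steps per epoch × 2 epochs, per the results table). A minimal sketch of that schedule in plain Python, assuming no warmup steps (the `linear_lr` helper is hypothetical; the Trainer computes this internally):

```python
def linear_lr(step: int, base_lr: float = 5.6e-5, total_steps: int = 880) -> float:
    """Learning rate after `step` optimizer steps under a linear decay
    schedule with no warmup (a simplification of the trainer's scheduler)."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

print(linear_lr(0))    # full learning rate at the start: 5.6e-05
print(linear_lr(440))  # halfway (end of epoch 1): 2.8e-05
print(linear_lr(880))  # decayed to zero at the final step: 0.0
```

Because the schedule is still near its peak for all of epoch 1 and ramps down through epoch 2, the second epoch's smaller updates are consistent with the validation loss dropping from 0.1988 to 0.1686.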
generation_config.json CHANGED
@@ -1,37 +1,6 @@
 {
-  "_from_model_config": false,
-  "assistant_confidence_threshold": 0.4,
-  "assistant_lookbehind": 10,
   "decoder_start_token_id": 0,
-  "diversity_penalty": 0.0,
-  "do_sample": false,
-  "early_stopping": false,
-  "encoder_no_repeat_ngram_size": 0,
-  "encoder_repetition_penalty": 1.0,
-  "eos_token_id": [
-    1
-  ],
-  "epsilon_cutoff": 0.0,
-  "eta_cutoff": 0.0,
-  "length_penalty": 1.0,
-  "max_length": 20,
-  "min_length": 0,
-  "no_repeat_ngram_size": 0,
-  "num_assistant_tokens": 20,
-  "num_assistant_tokens_schedule": "constant",
-  "num_beam_groups": 1,
-  "num_beams": 1,
-  "num_return_sequences": 1,
-  "output_scores": false,
+  "eos_token_id": 1,
   "pad_token_id": 0,
-  "remove_invalid_values": false,
-  "repetition_penalty": 1.0,
-  "return_dict_in_generate": false,
-  "target_lookbehind": 10,
-  "temperature": 1.0,
-  "top_k": 50,
-  "top_p": 1.0,
-  "transformers_version": "5.2.0",
-  "typical_p": 1.0,
-  "use_cache": true
 }
+  "transformers_version": "4.51.2"
+}
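The new `generation_config.json` keeps only the fields that differ from the model's defaults (note `eos_token_id` also collapses from a one-element list to a plain integer). A quick standard-library sketch of producing an equivalent minimal file, with values copied from the diff above:

```python
import json

# Minimal generation config matching the new file: only the
# non-default fields are kept (values taken from the diff above).
generation_config = {
    "decoder_start_token_id": 0,
    "eos_token_id": 1,
    "pad_token_id": 0,
    "transformers_version": "4.51.2",
}

text = json.dumps(generation_config, indent=2)
print(text)
# Round-trip check: parsing the serialized text restores the same mapping.
assert json.loads(text) == generation_config
```

Keeping the file this small makes the intent of the commit explicit: generation behavior falls back to the model-config defaults everywhere except the token ids pinned here.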