prakod committed · verified
Commit 0a9df68 · 1 Parent(s): 0ca79e6

Model save

Files changed (2):
  1. README.md +8 -13
  2. model.safetensors +1 -1
README.md CHANGED
@@ -17,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [ai4bharat/IndicBART](https://huggingface.co/ai4bharat/IndicBART) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.4771
- - Bleu: 30.8359
- - Gen Len: 14.4799
+ - Loss: 1.1507
+ - Bleu: 31.3951
+ - Gen Len: 14.701
  
  ## Model description
  
@@ -39,11 +39,11 @@ More information needed
  
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 32
+ - total_train_batch_size: 64
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - num_epochs: 3
@@ -53,17 +53,12 @@ The following hyperparameters were used during training:
  
  | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
  |:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|
- | 7.0195 | 0.4555 | 500 | 3.2801 | 33.8167 | 17.7783 |
- | 1.8495 | 0.9110 | 1000 | 0.9274 | 37.2318 | 14.4144 |
- | 1.2111 | 1.3671 | 1500 | 0.5053 | 31.1943 | 14.467 |
- | 0.8509 | 1.8226 | 2000 | 0.4772 | 30.8288 | 14.48 |
- | 0.844 | 2.2788 | 2500 | 0.4771 | 30.8365 | 14.4778 |
- | 0.8527 | 2.7342 | 3000 | 0.4771 | 30.8359 | 14.4799 |
+ | 2.6832 | 1.9945 | 1000 | 1.1507 | 31.3951 | 14.701 |
  
  
  ### Framework versions
  
  - Transformers 4.51.3
  - Pytorch 2.6.0+cu124
- - Datasets 2.14.4
+ - Datasets 3.6.0
  - Tokenizers 0.21.1
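For reference, the updated hyperparameters above map onto a Transformers `Seq2SeqTrainingArguments` configuration roughly as sketched below. This is an assumption for illustration only: the training script itself is not part of this commit, and the `output_dir` value is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch (assumed, not the actual training script): per-device batch size 16
# with 4 gradient-accumulation steps reproduces the reported total train batch
# size of 64; optim="adamw_torch" corresponds to OptimizerNames.ADAMW_TORCH
# with the default betas=(0.9, 0.999) and epsilon=1e-08.
args = Seq2SeqTrainingArguments(
    output_dir="indicbart-finetuned",   # hypothetical output path
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=3,
    predict_with_generate=True,         # needed to compute Bleu / Gen Len at eval time
)
```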
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4d4ff43f674e6d76ef9b7f168e35609577e09c573b6db764fb08969a43dc1efd
+ oid sha256:4f559e18f3d9644e1753317a1525a7560c6d15aaa15afac3e857fabcba0da62f
  size 976355336
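Since model.safetensors is the only weight file touched, downstream users just need to re-pull the checkpoint. A minimal loading sketch, assuming the standard Transformers API for IndicBART-style (MBart) checkpoints; the repository id below is a placeholder, as the full repo path is not shown on this commit page, and the tokenizer flags mirror the upstream ai4bharat/IndicBART card rather than anything stated here.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "prakod/<this-model-repo>"  # placeholder: substitute the actual repository name

# Assumption: the fine-tune keeps the upstream IndicBART tokenizer setup
# (SentencePiece-based, loaded with use_fast=False and keep_accents=True).
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=False, keep_accents=True)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)  # picks up the updated model.safetensors
```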