ninagroot committed on
Commit 4070047 · verified · 1 Parent(s): dfbef2e

ninagroot/babyllamatest

Files changed (3)
  1. README.md +22 -23
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -4,7 +4,6 @@ tags:
  model-index:
  - name: Baby-Llama-58M
  results: []
- pipeline_tag: text-generation
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -14,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 3.9707
+ - Loss: 3.9569
 
  ## Model description
 
@@ -47,26 +46,26 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 205.7347 | 1.0 | 69 | 164.9866 |
- | 140.7988 | 2.0 | 138 | 105.9197 |
- | 69.569 | 3.0 | 207 | 46.6930 |
- | 28.052 | 4.0 | 276 | 19.8943 |
- | 14.7501 | 5.0 | 345 | 11.8347 |
- | 10.0078 | 6.0 | 414 | 8.8358 |
- | 6.8621 | 7.0 | 483 | 6.8726 |
- | 6.2461 | 8.0 | 552 | 6.4684 |
- | 5.4379 | 9.0 | 621 | 5.6002 |
- | 4.8584 | 10.0 | 690 | 5.3592 |
- | 4.652 | 11.0 | 759 | 5.0464 |
- | 4.2405 | 12.0 | 828 | 4.6742 |
- | 3.9809 | 13.0 | 897 | 4.3925 |
- | 3.7987 | 14.0 | 966 | 4.2740 |
- | 3.6593 | 15.0 | 1035 | 4.1871 |
- | 3.4527 | 16.0 | 1104 | 4.1033 |
- | 3.4028 | 17.0 | 1173 | 4.0354 |
- | 3.2057 | 18.0 | 1242 | 3.9949 |
- | 3.2595 | 19.0 | 1311 | 3.9728 |
- | 3.2917 | 20.0 | 1380 | 3.9707 |
+ | 207.4139 | 1.0 | 69 | 168.7495 |
+ | 140.1234 | 2.0 | 138 | 105.7544 |
+ | 65.5354 | 3.0 | 207 | 45.8237 |
+ | 25.9459 | 4.0 | 276 | 19.2743 |
+ | 14.1729 | 5.0 | 345 | 11.7973 |
+ | 9.9299 | 6.0 | 414 | 8.2180 |
+ | 6.8093 | 7.0 | 483 | 6.8497 |
+ | 6.1741 | 8.0 | 552 | 6.4197 |
+ | 5.4877 | 9.0 | 621 | 5.6851 |
+ | 4.7765 | 10.0 | 690 | 5.4365 |
+ | 4.6208 | 11.0 | 759 | 5.0201 |
+ | 4.146 | 12.0 | 828 | 4.8232 |
+ | 3.9427 | 13.0 | 897 | 4.4196 |
+ | 3.746 | 14.0 | 966 | 4.2562 |
+ | 3.6516 | 15.0 | 1035 | 4.1581 |
+ | 3.4029 | 16.0 | 1104 | 4.0782 |
+ | 3.3875 | 17.0 | 1173 | 4.0212 |
+ | 3.1863 | 18.0 | 1242 | 3.9801 |
+ | 3.2367 | 19.0 | 1311 | 3.9602 |
+ | 3.2766 | 20.0 | 1380 | 3.9569 |
 
 
  ### Framework versions
@@ -74,4 +73,4 @@ The following hyperparameters were used during training:
  - Transformers 4.37.2
  - Pytorch 2.1.2+cu121
  - Datasets 2.16.1
- - Tokenizers 0.15.0
+ - Tokenizers 0.15.0
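To put the change in the reported evaluation loss in perspective, the two final-epoch values from the README diff can be compared directly. The sketch below is illustrative only: it assumes the reported loss is a mean per-token cross-entropy in nats, which the model card does not state, and the `perplexity` helper is a hypothetical name introduced here.

```python
import math

# Final-epoch (epoch 20, step 1380) validation losses copied from the README diff.
OLD_EVAL_LOSS = 3.9707  # value on the removed line
NEW_EVAL_LOSS = 3.9569  # value on the added line


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss to perplexity.

    Assumption: the loss is a per-token mean measured in nats.
    """
    return math.exp(loss_nats)


loss_delta = NEW_EVAL_LOSS - OLD_EVAL_LOSS
old_ppl = perplexity(OLD_EVAL_LOSS)
new_ppl = perplexity(NEW_EVAL_LOSS)

print(f"loss change: {loss_delta:+.4f}")
print(f"perplexity (under the stated assumption): {old_ppl:.2f} -> {new_ppl:.2f}")
```

The difference between the two runs is small relative to the loss itself, so it may reflect nothing more than run-to-run variation; the diff alone does not say what changed in training.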
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7e635cde993d1b74d9570f588cc5fb277b168d39bbca59069c890601007f494c
+ oid sha256:30f1bd31251eacb162c62b163981446fb3b2d4eaaa6053668d1b918b7b5fa5a0
  size 185517896
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4e2f5447f468a9cd43798831fadb6635e45910c2eb3f636c4e6469fef24b0e91
+ oid sha256:8e1383968255addb16ba6b2f8442175b7cab5a71cfa956f933a3f957418af4f0
  size 4792
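The two binary files are stored as Git LFS pointers, so the diff shows only their `oid` and `size` fields. A minimal sketch of reading such a pointer and confirming what changed follows; it uses the `training_args.bin` pointers from this commit, and `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Pointer contents for training_args.bin before and after the commit.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4e2f5447f468a9cd43798831fadb6635e45910c2eb3f636c4e6469fef24b0e91
size 4792
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8e1383968255addb16ba6b2f8442175b7cab5a71cfa956f933a3f957418af4f0
size 4792
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

content_changed = old["oid"] != new["oid"]   # different hash: file bytes differ
size_unchanged = old["size"] == new["size"]  # same byte count before and after

print(f"content changed: {content_changed}, size unchanged: {size_unchanged}")
```

An identical size with a different hash is consistent with the same file layout being re-saved with different contents, as would be expected when a model of the same architecture is retrained.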