ninagroot/Llama-360M

Files changed (4)

README.md +20 -38
model.safetensors +1 -1
runs/Apr22_21-01-14_gcn12.local.snellius.surf.nl/events.out.tfevents.1713812489.gcn12.local.snellius.surf.nl.3550372.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.9792
 ## Model description
@@ -41,49 +41,31 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 50
-- num_epochs: 40
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 8.5497        | 0.89  | 2    | 8.5052          |
-| 8.1931        | 1.78  | 4    | 8.1997          |
-| 7.6129        | 2.67  | 6    | 7.7377          |
-| 6.8911        | 4.0   | 9    | 7.1268          |
-| 6.5051        | 4.89  | 11   | 6.8545          |
-| 6.2169        | 5.78  | 13   | 6.6684          |
-| 5.9195        | 6.67  | 15   | 6.4750          |
-| 5.5126        | 8.0   | 18   | 6.1687          |
-| 5.3062        | 8.89  | 20   | 6.0107          |
-| 4.9111        | 9.78  | 22   | 5.8687          |
-| 4.7431        | 10.67 | 24   | 5.6626          |
-| 4.2334        | 12.0  | 27   | 5.4330          |
-| 4.0508        | 12.89 | 29   | 5.1847          |
-| 3.7627        | 13.78 | 31   | 5.0201          |
-| 3.4304        | 14.67 | 33   | 4.9383          |
-| 3.101         | 16.0  | 36   | 4.9051          |
-| 2.889         | 16.89 | 38   | 4.8583          |
-| 2.5785        | 17.78 | 40   | 4.8899          |
-| 2.2328        | 18.67 | 42   | 4.8363          |
-| 1.875         | 20.0  | 45   | 4.8183          |
-| 1.5514        | 20.89 | 47   | 4.8539          |
-| 1.1894        | 21.78 | 49   | 4.9134          |
-| 0.9224        | 22.67 | 51   | 4.8557          |
-| 0.6526        | 24.0  | 54   | 4.9283          |
-| 0.49          | 24.89 | 56   | 5.0074          |
-| 0.3634        | 25.78 | 58   | 4.9522          |
-| 0.2674        | 26.67 | 60   | 5.0044          |
-| 0.2052        | 28.0  | 63   | 4.9431          |
-| 0.166         | 28.89 | 65   | 4.9770          |
-| 0.1323        | 29.78 | 67   | 4.9768          |
-| 0.1059        | 30.67 | 69   | 4.9760          |
-| 0.0858        | 32.0  | 72   | 4.9849          |
-| 0.079         | 32.89 | 74   | 4.9828          |
-| 0.0687        | 33.78 | 76   | 4.9798          |
-| 0.0636        | 34.67 | 78   | 4.9792          |
-| 0.061         | 35.56 | 80   | 4.9792          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.8426
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 50
+- num_epochs: 20
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 8.5997        | 0.89  | 2    | 8.5409          |
+| 8.2818        | 1.78  | 4    | 8.2523          |
+| 7.6582        | 2.67  | 6    | 7.8134          |
+| 6.881         | 4.0   | 9    | 7.1270          |
+| 6.4599        | 4.89  | 11   | 6.8434          |
+| 6.1818        | 5.78  | 13   | 6.6395          |
+| 5.8836        | 6.67  | 15   | 6.4502          |
+| 5.5042        | 8.0   | 18   | 6.1589          |
+| 5.2565        | 8.89  | 20   | 5.9815          |
+| 4.8638        | 9.78  | 22   | 5.8434          |
+| 4.6811        | 10.67 | 24   | 5.6290          |
+| 4.162         | 12.0  | 27   | 5.3371          |
+| 3.9392        | 12.89 | 29   | 5.1585          |
+| 3.6738        | 13.78 | 31   | 5.0040          |
+| 3.3264        | 14.67 | 33   | 4.9210          |
+| 2.9917        | 16.0  | 36   | 4.8846          |
+| 2.7623        | 16.89 | 38   | 4.8758          |
+| 2.478         | 17.78 | 40   | 4.8426          |
 ### Framework versions
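
The two training tables in this README diff can be compared with a short script. The sketch below is not part of the commit. It copies a subset of the (step, validation loss) rows from the removed 40-epoch table and the added 20-epoch table, finds the row with the lowest validation loss in each, and converts a loss to perplexity. The helper names `best_checkpoint` and `perplexity` are hypothetical, and the perplexity conversion assumes the reported loss is a mean cross-entropy in nats, which the diff does not state.

```python
import math

# Subset of (step, validation loss) rows copied from the removed 40-epoch table:
# the region around its minimum, plus the final logged row.
old_run = [
    (36, 4.9051), (38, 4.8583), (40, 4.8899), (42, 4.8363),
    (45, 4.8183), (47, 4.8539), (49, 4.9134), (51, 4.8557),
    (80, 4.9792),
]

# Last four rows of the added 20-epoch table.
new_run = [(33, 4.9210), (36, 4.8846), (38, 4.8758), (40, 4.8426)]


def best_checkpoint(rows):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    for name, rows in (("40-epoch run", old_run), ("20-epoch run", new_run)):
        step, loss = best_checkpoint(rows)
        final_step, final_loss = rows[-1]
        print(f"{name}: best val loss {loss:.4f} at step {step}, "
              f"final val loss {final_loss:.4f} at step {final_step}, "
              f"perplexity at best ~ {perplexity(loss):.1f}")
```

On these rows, the removed run reaches its lowest validation loss at step 45 and then drifts upward while training loss keeps falling, whereas the last logged row of the added run is also its lowest.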

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:218c35ebb9d82f55932847541e4948912e899accf4fb246b2ee81d3d446c15c4
 size 1344213240

 version https://git-lfs.github.com/spec/v1
+oid sha256:11db9296555813830fcd05a0b6549211fa3a57ce1671fdbf1e01cc612e245a51
 size 1344213240

runs/Apr22_21-01-14_gcn12.local.snellius.surf.nl/events.out.tfevents.1713812489.gcn12.local.snellius.surf.nl.3550372.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e797e87a2ec2ae43df99a86d787baf80cac35b12b8467f5ab1200b43c710f342
+size 17977

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1e3362fa68c5ab8d42f5ff87b9d3459feb3cbd9664fec82296692784cc7b67c7
 size 4984

 version https://git-lfs.github.com/spec/v1
+oid sha256:f85b83857ad58d44e98a7d2bf3c2e0558f37a2fdf1a6147d234cce95d1982e17
 size 4984