Model save

Files changed (4)

README.md CHANGED

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.2592
 ## Model description
@@ -44,17 +44,11 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch  | Step  | Validation Loss |
-|:-------------:|:------:|:-----:|:---------------:|
-| 3.8102        | 0.1067 | 5000  | 3.7427          |
-| 3.6108        | 0.2133 | 10000 | 3.5449          |
-| 3.5314        | 0.32   | 15000 | 3.4712          |
-| 3.4908        | 0.4267 | 20000 | 3.4197          |
-| 3.4619        | 0.5333 | 25000 | 3.3903          |
-| 3.4429        | 0.64   | 30000 | 3.3747          |
-| 3.4329        | 0.7467 | 35000 | 3.3571          |
-| 3.3625        | 0.8533 | 40000 | 3.2989          |
-| 3.3315        | 0.96   | 45000 | 3.2592          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.4163
 ## Model description
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 3.7975        | 0.32  | 5000  | 3.8163          |
+| 3.5694        | 0.64  | 10000 | 3.5999          |
+| 3.3983        | 0.96  | 15000 | 3.4163          |
 ### Framework versions
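
To put the two "Training results" tables side by side, the following minimal sketch compares their final rows. It assumes the reported loss is a mean per-token cross-entropy in nats (so that `exp(loss)` is a perplexity), which the model card does not state. The names `old_final`, `new_final`, `steps_per_epoch`, and `perplexity` are illustrative only.

```python
import math

# Final rows of the removed (-) and added (+) results tables in this diff.
old_final = {"epoch": 0.96, "step": 45000, "val_loss": 3.2592}
new_final = {"epoch": 0.96, "step": 15000, "val_loss": 3.4163}


def steps_per_epoch(row):
    """Optimizer steps in one epoch, inferred from a logged (step, epoch) pair."""
    return round(row["step"] / row["epoch"])


def perplexity(loss):
    """ASSUMPTION: loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


loss_delta = new_final["val_loss"] - old_final["val_loss"]
ppl_ratio = perplexity(new_final["val_loss"]) / perplexity(old_final["val_loss"])
step_ratio = steps_per_epoch(old_final) / steps_per_epoch(new_final)

print(f"validation loss change: {loss_delta:+.4f}")
print(f"perplexity ratio (new / old): {ppl_ratio:.3f}")
print(f"steps per epoch, old vs new: {steps_per_epoch(old_final)} vs {steps_per_epoch(new_final)}")
```

The new run reaches the same fraction of an epoch in a third of the steps. That would be consistent with a larger effective batch size, but the hyperparameter section is not shown in this diff, so treat that reading as a guess.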

config.json CHANGED

@@ -3,13 +3,13 @@
     "NeoLLMForCausalLM"
   ],
   "attention_bias": false,
-  "attention_dropout": 0.1,
   "auto_map": {
     "AutoConfig": "configuration_neollm.NeoLLMConfig",
     "AutoModel": "modeling_neollm.NeoLLMModel",
     "AutoModelForCausalLM": "modeling_neollm.NeoLLMForCausalLM"
   },
-  "dropout_rate": 0.1,
   "dtype": "bfloat16",
   "eos_token_id": 151645,
   "fan_ratio": 0.125,

     "NeoLLMForCausalLM"
   ],
   "attention_bias": false,
+  "attention_dropout": 0.0,
   "auto_map": {
     "AutoConfig": "configuration_neollm.NeoLLMConfig",
     "AutoModel": "modeling_neollm.NeoLLMModel",
     "AutoModelForCausalLM": "modeling_neollm.NeoLLMForCausalLM"
   },
+  "dropout_rate": 0.0,
   "dtype": "bfloat16",
   "eos_token_id": 151645,
   "fan_ratio": 0.125,
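
The config change above amounts to two keys. As a rough sketch, the snippet below lists the keys whose values differ between two config dictionaries. The dictionaries hold only the scalar fields visible in this hunk, not the full `config.json`, and `changed_keys` is a hypothetical helper, not part of any library.

```python
# Only the scalar fields visible in this hunk; the real config.json has more keys.
old_cfg = {
    "attention_bias": False,
    "attention_dropout": 0.1,
    "dropout_rate": 0.1,
    "dtype": "bfloat16",
    "eos_token_id": 151645,
    "fan_ratio": 0.125,
}
new_cfg = dict(old_cfg, attention_dropout=0.0, dropout_rate=0.0)


def changed_keys(old, new):
    """Map each key whose value differs to an (old_value, new_value) pair."""
    keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


for key, (before, after) in changed_keys(old_cfg, new_cfg).items():
    print(f"{key}: {before} -> {after}")
```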

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8fa4ef5c1b7c936a61bd1f72c14f98a76c44cceeb1d58a06d78d7a06e067fb41
-size 245237072

 version https://git-lfs.github.com/spec/v1
+oid sha256:14a73049118eb00022b428d10b8c7d9713770f38ef65f86eda0129f09ad2156f
+size 245234560
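
The `model.safetensors` entries above are Git LFS pointer files, not the weights themselves. A minimal sketch of parsing such a pointer and comparing the two recorded sizes follows; `parse_lfs_pointer` is an illustrative helper and assumes the simple `key value` line layout shown in this diff.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is recorded in bytes
    return fields


old_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:8fa4ef5c1b7c936a61bd1f72c14f98a76c44cceeb1d58a06d78d7a06e067fb41\n"
    "size 245237072\n"
)
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:14a73049118eb00022b428d10b8c7d9713770f38ef65f86eda0129f09ad2156f\n"
    "size 245234560\n"
)

size_delta = new_pointer["size"] - old_pointer["size"]
print(f"model.safetensors size change: {size_delta} bytes")
```

The size changes by a few kilobytes while the hash changes entirely. The diff alone does not say which tensors or metadata account for the difference.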

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:26cbc8679f7064cd98348cbc6509941cbcedb48a1587673ba59f59a8521c899e
 size 5585

 version https://git-lfs.github.com/spec/v1
+oid sha256:86d71a19a4e8e2ae7a224dd51e85a427e5436faac9e034d1764e85669441181e
 size 5585