sophiargh/MNLP_M3_mcqa_model_v2

Text Generation

Generated from Trainer

text-generation-inference

sophiargh committed on Jun 9, 2025

Commit 7fcaf71 · verified · 1 Parent(s): 2869276

End of training

Files changed (2)

README.md +15 -12
model.safetensors +1 -1

README.md CHANGED

@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 # MNLP_M3_mcqa_model
-This model is a fine-tuned version of [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0028
 ## Model description
@@ -35,24 +35,27 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 4
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.0019        | 1.0   | 2196 | 0.0018          |
-| 0.0011        | 2.0   | 4392 | 0.0018          |
-| 0.0003        | 3.0   | 6588 | 0.0028          |
 ### Framework versions

 # MNLP_M3_mcqa_model
+This model is a fine-tuned version of [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2439
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 8
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.01
 - num_epochs: 4
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.2604        | 0.0649 | 1000 | 0.2616          |
+| 0.2425        | 0.1299 | 2000 | 0.2582          |
+| 0.2335        | 0.1948 | 3000 | 0.2510          |
+| 0.2202        | 0.2598 | 4000 | 0.2430          |
+| 0.2164        | 0.3247 | 5000 | 0.2459          |
+| 0.2072        | 0.3897 | 6000 | 0.2439          |
 ### Framework versions
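
The batch-size and step figures on both sides of this README diff can be cross-checked with a little arithmetic. The sketch below assumes a single training device (the card does not state the device count) and that `total_train_batch_size` is the product of per-device batch size, gradient accumulation steps, and device count. The names `RUNS`, `NUM_DEVICES`, `effective_batch_size`, and `steps_per_epoch` are illustrative and not part of the repository.

```python
# Values copied from the old (-) and new (+) sides of the README diff.
RUNS = {
    "old": {"train_batch_size": 4, "gradient_accumulation_steps": 4,
            "reported_total": 16, "last_row": (6588, 3.0)},
    "new": {"train_batch_size": 2, "gradient_accumulation_steps": 4,
            "reported_total": 8, "last_row": (6000, 0.3897)},
}

NUM_DEVICES = 1  # assumption: the model card does not state the device count


def effective_batch_size(per_device, accumulation_steps, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return per_device * accumulation_steps * num_devices


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one (step, epoch) table row.

    The epoch column is rounded in the card, so this is approximate.
    """
    return step / epoch


for name, run in RUNS.items():
    batch = effective_batch_size(run["train_batch_size"],
                                 run["gradient_accumulation_steps"])
    # Under the single-device assumption this matches the reported total.
    assert batch == run["reported_total"]
    step, epoch = run["last_row"]
    per_epoch = steps_per_epoch(step, epoch)
    print(f"{name}: batch={batch}, ~{per_epoch:.0f} steps/epoch, "
          f"~{batch * per_epoch:.0f} examples/epoch")
```

Under these assumptions the two runs imply very different numbers of examples per epoch, which would be consistent with a change of training data between commits, although the card itself only says "an unknown dataset". Note also that the last logged row of the new run sits at epoch 0.3897, while `num_epochs` is 4.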

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8de2862e84ecf33c3d493b8e4001dbfbe454f101dda137b481388d60eb290120
 size 1192135096

 version https://git-lfs.github.com/spec/v1
+oid sha256:a3b5cb9880b87b94df5d538e1344ee53825103c080c8cf0273fd4cbadd050a16
 size 1192135096
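
The `model.safetensors` entry is a Git LFS pointer file, so the diff shows only the pointer's key/value lines rather than the weights themselves. As a minimal sketch, the two pointers above can be parsed and compared as follows; `parse_lfs_pointer` is a hypothetical helper written for this illustration, not a Git LFS or Hugging Face API.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the old (-) and new (+) sides of the diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8de2862e84ecf33c3d493b8e4001dbfbe454f101dda137b481388d60eb290120
size 1192135096
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a3b5cb9880b87b94df5d538e1344ee53825103c080c8cf0273fd4cbadd050a16
size 1192135096
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

print("same size:", old["size"] == new["size"])
print("same content hash:", old["digest"] == new["digest"])
```

An unchanged size with a different SHA-256 digest is what one would expect when the weights change but the tensor shapes and dtypes stay the same, which fits a retraining of the same base model.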