---
library_name: peft
base_model: AnnaelleMyriam/SFT_M3_model
tags:
- mcqa
- question-answering
- sft
- lora
- qwen
- unsloth
- generated_from_trainer
model-index:
- name: MNLP_M3_mcqa_sft_model
  results: []
---

# MNLP_M3_mcqa_sft_model

This model is a fine-tuned version of [AnnaelleMyriam/SFT_M3_model](https://huggingface.co/AnnaelleMyriam/SFT_M3_model) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5993

## Model description

More information needed

## Intended uses & limitations

More information needed. A hedged loading sketch is provided in the "How to use (sketch)" section below.

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` reconstruction is sketched after the framework versions):
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 4

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3535        | 0.1352 | 250  | 0.4926          |
| 0.4864        | 0.2703 | 500  | 0.3696          |
| 0.342         | 0.4055 | 750  | 0.3518          |
| 0.3763        | 0.5407 | 1000 | 0.3259          |
| 0.3566        | 0.6759 | 1250 | 0.3335          |
| 0.2901        | 0.8110 | 1500 | 0.3195          |
| 0.3235        | 0.9462 | 1750 | 0.3060          |
| 0.2315        | 1.0811 | 2000 | 0.3930          |
| 0.2842        | 1.2163 | 2250 | 0.3920          |
| 0.2183        | 1.3514 | 2500 | 0.3796          |
| 0.1824        | 1.4866 | 2750 | 0.3979          |
| 0.1877        | 1.6218 | 3000 | 0.4335          |
| 0.1821        | 1.7570 | 3250 | 0.3981          |
| 0.2364        | 1.8921 | 3500 | 0.3922          |
| 0.1339        | 2.0270 | 3750 | 0.4119          |
| 0.1073        | 2.1622 | 4000 | 0.5467          |
| 0.0722        | 2.2974 | 4250 | 0.5596          |
| 0.113         | 2.4325 | 4500 | 0.5158          |
| 0.1467        | 2.5677 | 4750 | 0.4852          |
| 0.1675        | 2.7029 | 5000 | 0.5103          |
| 0.101         | 2.8381 | 5250 | 0.5661          |
| 0.1935        | 2.9732 | 5500 | 0.4946          |
| 0.1069        | 3.1081 | 5750 | 0.5844          |
| 0.0799        | 3.2433 | 6000 | 0.5681          |
| 0.0803        | 3.3785 | 6250 | 0.5795          |
| 0.0744        | 3.5137 | 6500 | 0.5935          |
| 0.0464        | 3.6488 | 6750 | 0.6010          |
| 0.0643        | 3.7840 | 7000 | 0.6009          |
| 0.0871        | 3.9192 | 7250 | 0.5993          |

Note that validation loss reaches its minimum (0.3060 at step 1750, near the end of epoch 1) and rises over the remaining epochs, which may indicate overfitting to the training set.

### Framework versions

- PEFT 0.15.2
- Transformers 4.52.4
- Pytorch 2.7.1+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1
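### Training configuration (sketch)

As a rough guide, the hyperparameters above map onto `transformers.TrainingArguments` as sketched below. This is a reconstruction from the card, not the original training script; the `output_dir`, logging, and evaluation settings are assumptions (the results table suggests evaluation every 250 steps).

```python
# Sketch of a TrainingArguments configuration matching the hyperparameters
# listed above. Reconstructed from the card, not the original training
# script; output_dir, logging, and eval settings are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MNLP_M3_mcqa_sft_model",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # total train batch size: 4 * 2 = 8
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_torch",  # AdamW with default betas=(0.9, 0.999), eps=1e-8
    seed=42,
    eval_strategy="steps",  # assumed; validation loss is reported every 250 steps
    eval_steps=250,
    logging_steps=250,
)
```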
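## How to use (sketch)

This repository contains a PEFT LoRA adapter on top of [AnnaelleMyriam/SFT_M3_model](https://huggingface.co/AnnaelleMyriam/SFT_M3_model). A minimal loading sketch follows; the adapter repo id below omits the namespace, and the MCQA prompt template is an assumption, since the card does not document either.

```python
# Minimal sketch of loading this LoRA adapter on top of its base model.
# The adapter repo id omits the namespace, and the MCQA prompt template
# is an assumption; the card does not document either.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "AnnaelleMyriam/SFT_M3_model"
adapter_id = "MNLP_M3_mcqa_sft_model"  # replace with the full <user>/<repo> id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Hypothetical multiple-choice prompt; match it to the format used in training.
prompt = (
    "Question: Which planet is closest to the Sun?\n"
    "A. Venus\nB. Earth\nC. Mercury\nD. Mars\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=2)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

For MCQA-style evaluation, comparing the model's per-choice log-likelihoods is a common alternative to free-form generation.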