---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: mistralai/Mistral-7B-Instruct-v0.2
model-index:
- name: ZeroShot-3.3.2-Mistral-7b-Multilanguage-3.1.0
  results: []
---

# ZeroShot-3.3.2-Mistral-7b-Multilanguage-3.1.0

This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2); the training dataset is not documented in this card.
It achieves the following result on the evaluation set:
- Loss: 0.3386

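Since this repository ships a PEFT adapter rather than merged weights, it can be loaded with `peft`'s `AutoPeftModelForCausalLM`, which fetches the base model and applies the adapter in one call. A minimal inference sketch; the adapter repo id and the prompt are placeholders, since the card documents neither the publishing path nor the intended task format:

```python
# Minimal inference sketch. "your-org/..." is a placeholder -- substitute the
# Hub path this adapter is actually published under.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "your-org/ZeroShot-3.3.2-Mistral-7b-Multilanguage-3.1.0"

# Loads mistralai/Mistral-7B-Instruct-v0.2 and applies the adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Mistral-Instruct expects the [INST] chat format; apply_chat_template emits it.
messages = [{"role": "user", "content": "Classify the sentiment of: 'Das Essen war hervorragend.'"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If standalone weights are preferred, `model.merge_and_unload()` folds the adapter into the base model before saving.
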
## Model description

This repository contains a PEFT adapter for Mistral-7B-Instruct-v0.2, trained with TRL's supervised fine-tuning (SFT) trainer. Beyond what its name suggests (zero-shot use across multiple languages), no further description has been provided.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
- mixed_precision_training: Native AMP

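These settings map directly onto `transformers.TrainingArguments` driven by TRL's `SFTTrainer` (the `trl` and `sft` tags indicate supervised fine-tuning). A reproduction sketch under stated assumptions: the dataset, text field, sequence length, and LoRA configuration are not recorded in this card, so those values are placeholders:

```python
# Reproduction sketch using the TRL API contemporary with the versions listed
# under "Framework versions". Values marked "assumed" are placeholders, not
# the original configuration.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("your-org/your-sft-dataset")  # assumed: dataset not documented

args = TrainingArguments(
    output_dir="ZeroShot-3.3.2-Mistral-7b-Multilanguage-3.1.0",
    learning_rate=2e-4,
    per_device_train_batch_size=8,   # train_batch_size: 8
    per_device_eval_batch_size=2,    # eval_batch_size: 2
    gradient_accumulation_steps=2,   # total_train_batch_size: 8 * 2 = 16
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    fp16=True,                       # "Native AMP"; the card does not say fp16 vs bf16
    evaluation_strategy="steps",
    eval_steps=50,                   # matches the 50-step eval cadence below
    logging_steps=50,
)

# LoRA hyperparameters are not recorded in this card; r and alpha are assumed.
peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],    # assumed split name
    dataset_text_field="text",       # assumed field name
    peft_config=peft_config,
    max_seq_length=1024,             # assumed
)
trainer.train()
```

Note that Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the `TrainingArguments` default optimizer configuration, so it needs no explicit flags.
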
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.9008 | 0.03 | 50 | 1.3557 |
| 0.724 | 0.06 | 100 | 0.5604 |
| 0.5279 | 0.09 | 150 | 0.5183 |
| 0.4864 | 0.12 | 200 | 0.4832 |
| 0.4598 | 0.16 | 250 | 0.4487 |
| 0.4286 | 0.19 | 300 | 0.4403 |
| 0.4463 | 0.22 | 350 | 0.4362 |
| 0.4279 | 0.25 | 400 | 0.4321 |
| 0.4252 | 0.28 | 450 | 0.4273 |
| 0.4214 | 0.31 | 500 | 0.4246 |
| 0.4198 | 0.34 | 550 | 0.4209 |
| 0.4152 | 0.37 | 600 | 0.4169 |
| 0.4114 | 0.4 | 650 | 0.4138 |
| 0.4197 | 0.43 | 700 | 0.4099 |
| 0.4102 | 0.47 | 750 | 0.4081 |
| 0.3914 | 0.5 | 800 | 0.4052 |
| 0.4038 | 0.53 | 850 | 0.4025 |
| 0.3941 | 0.56 | 900 | 0.4011 |
| 0.3989 | 0.59 | 950 | 0.3990 |
| 0.3947 | 0.62 | 1000 | 0.3968 |
| 0.3903 | 0.65 | 1050 | 0.3954 |
| 0.3903 | 0.68 | 1100 | 0.3931 |
| 0.3881 | 0.71 | 1150 | 0.3922 |
| 0.3928 | 0.74 | 1200 | 0.3901 |
| 0.3769 | 0.78 | 1250 | 0.3880 |
| 0.3717 | 0.81 | 1300 | 0.3860 |
| 0.3697 | 0.84 | 1350 | 0.3851 |
| 0.3666 | 0.87 | 1400 | 0.3834 |
| 0.3834 | 0.9 | 1450 | 0.3815 |
| 0.3777 | 0.93 | 1500 | 0.3801 |
| 0.3678 | 0.96 | 1550 | 0.3779 |
| 0.3779 | 0.99 | 1600 | 0.3777 |
| 0.3547 | 1.02 | 1650 | 0.3764 |
| 0.3463 | 1.05 | 1700 | 0.3749 |
| 0.3386 | 1.09 | 1750 | 0.3739 |
| 0.3493 | 1.12 | 1800 | 0.3737 |
| 0.3527 | 1.15 | 1850 | 0.3717 |
| 0.3471 | 1.18 | 1900 | 0.3712 |
| 0.3414 | 1.21 | 1950 | 0.3704 |
| 0.3464 | 1.24 | 2000 | 0.3683 |
| 0.3379 | 1.27 | 2050 | 0.3682 |
| 0.3469 | 1.3 | 2100 | 0.3665 |
| 0.3311 | 1.33 | 2150 | 0.3659 |
| 0.3377 | 1.36 | 2200 | 0.3644 |
| 0.3375 | 1.4 | 2250 | 0.3629 |
| 0.3415 | 1.43 | 2300 | 0.3619 |
| 0.3429 | 1.46 | 2350 | 0.3607 |
| 0.3316 | 1.49 | 2400 | 0.3607 |
| 0.3339 | 1.52 | 2450 | 0.3588 |
| 0.3438 | 1.55 | 2500 | 0.3581 |
| 0.3403 | 1.58 | 2550 | 0.3572 |
| 0.3343 | 1.61 | 2600 | 0.3555 |
| 0.3396 | 1.64 | 2650 | 0.3545 |
| 0.3349 | 1.67 | 2700 | 0.3537 |
| 0.3285 | 1.71 | 2750 | 0.3527 |
| 0.3241 | 1.74 | 2800 | 0.3518 |
| 0.3306 | 1.77 | 2850 | 0.3512 |
| 0.3265 | 1.8 | 2900 | 0.3499 |
| 0.3276 | 1.83 | 2950 | 0.3491 |
| 0.3259 | 1.86 | 3000 | 0.3486 |
| 0.3281 | 1.89 | 3050 | 0.3477 |
| 0.3199 | 1.92 | 3100 | 0.3470 |
| 0.3315 | 1.95 | 3150 | 0.3457 |
| 0.3306 | 1.98 | 3200 | 0.3455 |
| 0.306 | 2.02 | 3250 | 0.3463 |
| 0.2975 | 2.05 | 3300 | 0.3455 |
| 0.2906 | 2.08 | 3350 | 0.3457 |
| 0.2942 | 2.11 | 3400 | 0.3454 |
| 0.2898 | 2.14 | 3450 | 0.3450 |
| 0.299 | 2.17 | 3500 | 0.3446 |
| 0.2913 | 2.2 | 3550 | 0.3436 |
| 0.2891 | 2.23 | 3600 | 0.3429 |
| 0.2875 | 2.26 | 3650 | 0.3439 |
| 0.2838 | 2.29 | 3700 | 0.3426 |
| 0.2944 | 2.33 | 3750 | 0.3424 |
| 0.2904 | 2.36 | 3800 | 0.3424 |
| 0.2926 | 2.39 | 3850 | 0.3420 |
| 0.2992 | 2.42 | 3900 | 0.3413 |
| 0.2834 | 2.45 | 3950 | 0.3412 |
| 0.2923 | 2.48 | 4000 | 0.3406 |
| 0.291 | 2.51 | 4050 | 0.3401 |
| 0.2868 | 2.54 | 4100 | 0.3402 |
| 0.2867 | 2.57 | 4150 | 0.3398 |
| 0.2837 | 2.6 | 4200 | 0.3399 |
| 0.288 | 2.64 | 4250 | 0.3393 |
| 0.2874 | 2.67 | 4300 | 0.3393 |
| 0.2866 | 2.7 | 4350 | 0.3392 |
| 0.2884 | 2.73 | 4400 | 0.3390 |
| 0.2862 | 2.76 | 4450 | 0.3389 |
| 0.2938 | 2.79 | 4500 | 0.3389 |
| 0.3009 | 2.82 | 4550 | 0.3387 |
| 0.2896 | 2.85 | 4600 | 0.3387 |
| 0.2902 | 2.88 | 4650 | 0.3386 |
| 0.2891 | 2.91 | 4700 | 0.3386 |
| 0.2926 | 2.95 | 4750 | 0.3386 |
| 0.2868 | 2.98 | 4800 | 0.3386 |

### Framework versions

- PEFT 0.8.2
- Transformers 4.38.0
- Pytorch 2.1.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2
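
To reproduce this environment, the recorded versions can be pinned at install time. `trl` is needed for the SFT training shown above, but its version is not recorded in this card, so it is left unpinned:

```bash
pip install peft==0.8.2 transformers==4.38.0 datasets==2.17.1 tokenizers==0.15.2 trl
# The +cu121 build comes from the CUDA 12.1 wheel index.
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu121
```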