---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: mistralai/Mistral-7B-v0.1
model-index:
- name: ZeroShot-3.3.6-Mistral-7b-Multilanguage-3.2.0
  results: []
---

# ZeroShot-3.3.6-Mistral-7b-Multilanguage-3.2.0

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2603
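
Below is a minimal inference sketch for loading the adapter on top of the base model with `peft` and `transformers`. The adapter repo id, the prompt, and the generation settings are assumptions, not recorded in this card:

```python
# Minimal inference sketch (the adapter repo id below is assumed, not
# confirmed by this card; adjust it and the prompt to your setup).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "ironrock/ZeroShot-3.3.6-Mistral-7b-Multilanguage-3.2.0"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter
model.eval()

prompt = "Classify the sentiment of this review: 'Great product, fast delivery.'"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```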

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
- mixed_precision_training: Native AMP
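
As a rough guide, these hyperparameters map onto `transformers.TrainingArguments` combined with TRL's `SFTTrainer` as sketched below. The dataset, LoRA configuration, text field, and sequence length are assumptions; only the `TrainingArguments` values come from this card:

```python
# Hedged training sketch: TrainingArguments values are taken from the card;
# the dataset, LoRA config, and max_seq_length are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

train_dataset = load_dataset("json", data_files="train.jsonl", split="train")  # placeholder

args = TrainingArguments(
    output_dir="ZeroShot-3.3.6-Mistral-7b-Multilanguage-3.2.0",
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,  # 8 * 2 = effective batch size of 16
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    fp16=True,  # "Native AMP" mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the defaults.
)

peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)  # assumed LoRA settings

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",
    args=args,
    train_dataset=train_dataset,
    dataset_text_field="text",  # assumed field name
    max_seq_length=1024,        # assumed
    peft_config=peft_config,
)
trainer.train()
```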

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.8669        | 0.03  | 50   | 0.4141          |
| 0.4025        | 0.06  | 100  | 0.4020          |
| 0.4016        | 0.09  | 150  | 0.4082          |
| 0.3981        | 0.12  | 200  | 0.4152          |
| 0.4065        | 0.16  | 250  | 0.3998          |
| 0.4003        | 0.19  | 300  | 0.3989          |
| 0.3964        | 0.22  | 350  | 0.3991          |
| 0.3825        | 0.25  | 400  | 0.3861          |
| 0.379         | 0.28  | 450  | 0.3804          |
| 0.3696        | 0.31  | 500  | 0.3756          |
| 0.3658        | 0.34  | 550  | 0.3662          |
| 0.3474        | 0.37  | 600  | 0.3615          |
| 0.3575        | 0.4   | 650  | 0.3527          |
| 0.346         | 0.43  | 700  | 0.3470          |
| 0.3486        | 0.47  | 750  | 0.3394          |
| 0.3326        | 0.5   | 800  | 0.3317          |
| 0.3253        | 0.53  | 850  | 0.3228          |
| 0.3151        | 0.56  | 900  | 0.3156          |
| 0.3031        | 0.59  | 950  | 0.3100          |
| 0.3106        | 0.62  | 1000 | 0.3028          |
| 0.2994        | 0.65  | 1050 | 0.2963          |
| 0.2974        | 0.68  | 1100 | 0.2901          |
| 0.2742        | 0.71  | 1150 | 0.2847          |
| 0.2873        | 0.74  | 1200 | 0.2789          |
| 0.2694        | 0.78  | 1250 | 0.2747          |
| 0.2738        | 0.81  | 1300 | 0.2699          |
| 0.2719        | 0.84  | 1350 | 0.2662          |
| 0.2525        | 0.87  | 1400 | 0.2637          |
| 0.2538        | 0.9   | 1450 | 0.2620          |
| 0.2576        | 0.93  | 1500 | 0.2610          |
| 0.258         | 0.96  | 1550 | 0.2605          |
| 0.2524        | 0.99  | 1600 | 0.2603          |

### Framework versions

- PEFT 0.8.2
- Transformers 4.38.1
- PyTorch 2.1.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2
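
A quick way to confirm that a local environment matches these pins (a convenience snippet, not part of the original card):

```python
# Compare installed package versions against the pins listed above.
from importlib.metadata import version

pins = {
    "peft": "0.8.2",
    "transformers": "4.38.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}
for pkg, expected in pins.items():
    installed = version(pkg)
    status = "OK" if installed == expected else f"mismatch (found {installed})"
    print(f"{pkg}=={expected}: {status}")
```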