raulgdp/Mistral-7B-Instruct-v0.3-JEP

+---
+library_name: peft
+license: apache-2.0
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+tags:
+- generated_from_trainer
+model-index:
+- name: Mistral-7B-Instruct-v0.3-JEP
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Mistral-7B-Instruct-v0.3-JEP
+This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.9339
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 1
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 4
+- optimizer: paged_adamw_8bit with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: linear
+- num_epochs: 10
+- mixed_precision_training: Native AMP
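The hyperparameters listed above can be mapped onto `transformers.TrainingArguments` keyword arguments roughly as sketched below. This is a reconstruction under assumptions, not the author's training script, which the card does not include: `output_dir` is a placeholder, `fp16=True` is an inferred reading of "Native AMP", and the 100-step evaluation and logging interval is inferred from the results table rather than stated in the card.

```python
# Hypothetical reconstruction of the Trainer configuration from the model card.
# Keys are TrainingArguments keyword names; values come from the card unless
# marked as an assumption.
training_kwargs = {
    "output_dir": "Mistral-7B-Instruct-v0.3-JEP",  # assumption: placeholder path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "optim": "paged_adamw_8bit",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,              # assumption: inferred from "Native AMP"
    "eval_strategy": "steps",  # assumption: the table reports every 100 steps
    "eval_steps": 100,         # assumption
    "logging_steps": 100,      # assumption
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step across all devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments(kwargs):
    """Build TrainingArguments; transformers is imported lazily on purpose."""
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)


print(effective_batch_size(training_kwargs))
```

With one device, the per-device batch size of 1 and 4 accumulation steps reproduce the card's `total_train_batch_size` of 4, which suggests (but does not prove) single-GPU training.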
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.1764        | 0.1535 | 100  | 1.1504          |
+| 1.0487        | 0.3070 | 200  | 1.0548          |
+| 0.9853        | 0.4605 | 300  | 1.0175          |
+| 0.9844        | 0.6140 | 400  | 0.9919          |
+| 1.011         | 0.7675 | 500  | 0.9780          |
+| 0.9396        | 0.9210 | 600  | 0.9663          |
+| 0.9259        | 1.0737 | 700  | 0.9569          |
+| 0.9444        | 1.2272 | 800  | 0.9483          |
+| 0.8928        | 1.3807 | 900  | 0.9415          |
+| 0.9195        | 1.5342 | 1000 | 0.9364          |
+| 0.8967        | 1.6876 | 1100 | 0.9338          |
+| 0.927         | 1.8411 | 1200 | 0.9300          |
+| 0.9417        | 1.9946 | 1300 | 0.9263          |
+| 0.9198        | 2.1474 | 1400 | 0.9276          |
+| 0.9108        | 2.3008 | 1500 | 0.9237          |
+| 0.8971        | 2.4543 | 1600 | 0.9223          |
+| 0.8758        | 2.6078 | 1700 | 0.9199          |
+| 0.8681        | 2.7613 | 1800 | 0.9169          |
+| 0.8557        | 2.9148 | 1900 | 0.9153          |
+| 0.82          | 3.0675 | 2000 | 0.9161          |
+| 0.8379        | 3.2210 | 2100 | 0.9170          |
+| 0.8414        | 3.3745 | 2200 | 0.9161          |
+| 0.9164        | 3.5280 | 2300 | 0.9141          |
+| 0.8764        | 3.6815 | 2400 | 0.9101          |
+| 0.8449        | 3.8350 | 2500 | 0.9094          |
+| 0.8708        | 3.9885 | 2600 | 0.9088          |
+| 0.83          | 4.1412 | 2700 | 0.9132          |
+| 0.7793        | 4.2947 | 2800 | 0.9148          |
+| 0.8527        | 4.4482 | 2900 | 0.9120          |
+| 0.7941        | 4.6017 | 3000 | 0.9102          |
+| 0.8103        | 4.7552 | 3100 | 0.9111          |
+| 0.7991        | 4.9087 | 3200 | 0.9083          |
+| 0.7791        | 5.0614 | 3300 | 0.9126          |
+| 0.8297        | 5.2149 | 3400 | 0.9154          |
+| 0.739         | 5.3684 | 3500 | 0.9181          |
+| 0.8456        | 5.5219 | 3600 | 0.9105          |
+| 0.826         | 5.6754 | 3700 | 0.9135          |
+| 0.8336        | 5.8289 | 3800 | 0.9127          |
+| 0.7995        | 5.9823 | 3900 | 0.9134          |
+| 0.7782        | 6.1351 | 4000 | 0.9207          |
+| 0.7822        | 6.2886 | 4100 | 0.9170          |
+| 0.7556        | 6.4421 | 4200 | 0.9182          |
+| 0.7522        | 6.5955 | 4300 | 0.9213          |
+| 0.7669        | 6.7490 | 4400 | 0.9168          |
+| 0.7503        | 6.9025 | 4500 | 0.9173          |
+| 0.7739        | 7.0553 | 4600 | 0.9217          |
+| 0.7699        | 7.2087 | 4700 | 0.9293          |
+| 0.761         | 7.3622 | 4800 | 0.9234          |
+| 0.7257        | 7.5157 | 4900 | 0.9269          |
+| 0.7394        | 7.6692 | 5000 | 0.9233          |
+| 0.7354        | 7.8227 | 5100 | 0.9218          |
+| 0.8162        | 7.9762 | 5200 | 0.9209          |
+| 0.7276        | 8.1289 | 5300 | 0.9294          |
+| 0.7477        | 8.2824 | 5400 | 0.9299          |
+| 0.7278        | 8.4359 | 5500 | 0.9282          |
+| 0.6571        | 8.5894 | 5600 | 0.9297          |
+| 0.7494        | 8.7429 | 5700 | 0.9286          |
+| 0.767         | 8.8964 | 5800 | 0.9267          |
+| 0.6792        | 9.0491 | 5900 | 0.9338          |
+| 0.7053        | 9.2026 | 6000 | 0.9350          |
+| 0.706         | 9.3561 | 6100 | 0.9351          |
+| 0.7232        | 9.5096 | 6200 | 0.9334          |
+| 0.7301        | 9.6631 | 6300 | 0.9332          |
+| 0.7424        | 9.8166 | 6400 | 0.9344          |
+| 0.6775        | 9.9701 | 6500 | 0.9339          |
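The table shows validation loss bottoming out around epoch 5 and then rising while training loss keeps falling. The sketch below makes that concrete using a subset of rows copied from the table (the last evaluation of each epoch from epoch 2 onward); the variable names are illustrative only. The card does not say whether the uploaded adapter is the final or the lowest-loss checkpoint, although the reported evaluation loss of 0.9339 matches the final step.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
rows = [
    (1.9946, 1300, 0.9263),
    (2.9148, 1900, 0.9153),
    (3.9885, 2600, 0.9088),
    (4.9087, 3200, 0.9083),
    (5.9823, 3900, 0.9134),
    (6.9025, 4500, 0.9173),
    (7.9762, 5200, 0.9209),
    (8.8964, 5800, 0.9267),
    (9.9701, 6500, 0.9339),
]

# Checkpoint with the lowest validation loss, and the last evaluated one.
best_epoch, best_step, best_loss = min(rows, key=lambda r: r[2])
final_epoch, final_step, final_loss = rows[-1]

# How much validation loss degraded between the best and the final checkpoint.
gap = final_loss - best_loss

print(f"best:  step {best_step} (epoch {best_epoch}), val loss {best_loss}")
print(f"final: step {final_step} (epoch {final_epoch}), val loss {final_loss}")
print(f"gap:   {gap:.4f}")
```

If the training run were repeated, setting `load_best_model_at_end=True` in `TrainingArguments`, or training for about half as many epochs, would be reasonable options to consider.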
+### Framework versions
+- PEFT 0.15.2
+- Transformers 4.51.3
+- Pytorch 2.6.0+cu126
+- Datasets 3.5.0
+- Tokenizers 0.21.1

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:91102fb0b26efeda8eba936986d347dadc8778c65588727138678c354d1d78e8
+oid sha256:20d7851a042f87dcac1e0a874ba5a91a5856f2c999516bbd84d9af79497714a0
 size 27297032
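The change above touches only the Git LFS pointer: the `oid` differs while `size` stays at 27297032 bytes, which is consistent with an adapter of the same shape holding different weights. As a minimal sketch, a downloaded `adapter_model.safetensors` could be checked against the new pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


new_pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:20d7851a042f87dcac1e0a874ba5a91a5856f2c999516bbd84d9af79497714a0
size 27297032"""
)
print(new_pointer["algo"], new_pointer["size"])
```

A call such as `matches_pointer("adapter_model.safetensors", new_pointer)` would then confirm that a local copy corresponds to this commit rather than the previous one.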