Model save
README.md
ADDED
@@ -0,0 +1,76 @@
---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
datasets:
- generator
base_model: mistralai/Mixtral-8x7B-v0.1
model-index:
- name: mixtral_id
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mixtral_id

This model is a fine-tuned version of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on the generator dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7745
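Because this repository contains a PEFT adapter rather than full model weights, inference requires loading the Mixtral base model and then attaching the adapter. A minimal sketch follows; the adapter repo id `your-username/mixtral_id` is a placeholder (the card does not state the actual repository path), and the dtype/device settings are assumptions.

```python
# Minimal sketch: load the Mixtral-8x7B base model and attach this PEFT adapter.
# "your-username/mixtral_id" is a placeholder repo id, not a confirmed path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",
    torch_dtype=torch.bfloat16,   # assumed precision
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-v0.1")

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, "your-username/mixtral_id")

prompt = "Explain what a mixture-of-experts language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```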
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 230
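For reference, the hyperparameters above correspond roughly to the `transformers.TrainingArguments` sketch below (as used with TRL's `SFTTrainer`, per the `trl`/`sft` tags). The output directory, precision, and logging/eval cadence are not stated in this card and are assumptions.

```python
# Sketch of a TrainingArguments object matching the listed hyperparameters.
# Anything not listed in the card (output_dir, bf16, eval/logging steps) is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mixtral_id",            # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=32,     # 2 * 32 = 64 total train batch size
    seed=42,
    optim="adamw_torch",                # Adam, betas=(0.9, 0.999), eps=1e-8 (defaults)
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_steps=230,
    evaluation_strategy="steps",        # assumed; the results table reports eval every 20 steps
    eval_steps=20,                      # assumed from the results table below
    bf16=True,                          # assumed
)
```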
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.9422        | 0.15  | 20   | 0.8942          |
| 0.8488        | 0.3   | 40   | 0.8458          |
| 0.8208        | 0.46  | 60   | 0.8204          |
| 0.7981        | 0.61  | 80   | 0.8045          |
| 0.7912        | 0.76  | 100  | 0.7936          |
| 0.7789        | 0.91  | 120  | 0.7852          |
| 0.7289        | 1.07  | 140  | 0.7810          |
| 0.7277        | 1.22  | 160  | 0.7780          |
| 0.7112        | 1.37  | 180  | 0.7758          |
| 0.7142        | 1.52  | 200  | 0.7747          |
| 0.7222        | 1.68  | 220  | 0.7745          |


### Framework versions

- PEFT 0.7.2.dev0
- Transformers 4.38.1
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:9c27edae5971ebbc4e1a0301e7d66628750bc5cf0c6c5ca4a059ccdb0f156223
 size 1938079368
runs/Mar26_03-24-06_llm-a100-40/events.out.tfevents.1711423460.llm-a100-40.4993.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:413e2ce5fedf9eb5af9f0cb6f96200d4ca9a14519364fceeab9f85bcfc069fe2
+size 13216
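Both blobs above are stored through Git LFS: each pointer file records only the `oid sha256` and `size` of the object it stands in for. After fetching the real files (for example with `git lfs pull`), they can be checked against those values with a short script like the one below; the local path is an assumption about where a clone places the file.

```python
# Sketch: verify a downloaded LFS object against the sha256/size recorded in its pointer file.
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large blobs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

path = "adapter_model.safetensors"  # assumed path inside a local clone of this repo
print("size  :", os.path.getsize(path))  # pointer says 1938079368
print("sha256:", sha256_of(path))        # pointer says 9c27edae5971ebbc4e1a0301e7d66628750bc5cf0c6c5ca4a059ccdb0f156223
```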