lvcalucioli committed (verified)
Commit cbaf053 · Parent: 8abb678

zephyr_outputs
README.md CHANGED
@@ -2,6 +2,8 @@
 license: mit
 library_name: peft
 tags:
+- trl
+- sft
 - generated_from_trainer
 base_model: HuggingFaceH4/zephyr-7b-beta
 model-index:
@@ -15,6 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 # zephyr_outputs
 
 This model is a fine-tuned version of [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.0417
 
 ## Model description
 
@@ -35,7 +39,7 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
 - train_batch_size: 4
-- eval_batch_size: 8
+- eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 10
 - total_train_batch_size: 40
@@ -46,6 +50,18 @@ The following hyperparameters were used during training:
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.7481        | 1.0   | 9    | 1.4558          |
+| 1.0584        | 2.0   | 18   | 1.3379          |
+| 0.6062        | 3.0   | 27   | 1.4452          |
+| 0.3013        | 4.0   | 36   | 1.6845          |
+| 0.1519        | 5.0   | 45   | 1.6926          |
+| 0.0697        | 6.0   | 54   | 1.8013          |
+| 0.0356        | 7.0   | 63   | 1.9110          |
+| 0.0203        | 8.0   | 72   | 1.9963          |
+| 0.0152        | 9.0   | 81   | 2.0359          |
+| 0.0086        | 10.0  | 90   | 2.0417          |
 
 
 ### Framework versions
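The hyperparameters in the README hunk are internally consistent: `total_train_batch_size` is the per-device batch size multiplied by the gradient-accumulation steps, and with 9 optimizer steps per epoch the training-results table implies roughly 360 samples seen per epoch. A minimal sketch of that arithmetic (values taken from the diff above):

```python
# Sanity-check the effective batch size implied by the README hyperparameters.
train_batch_size = 4              # per-device micro-batch size
gradient_accumulation_steps = 10  # micro-batches accumulated per optimizer step

# total_train_batch_size in the README is the product of the two.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 40

# The results table shows 9 optimizer steps per epoch (step 9 at epoch 1.0),
# so roughly 9 * 40 = 360 training samples are processed per epoch.
steps_per_epoch = 9
samples_per_epoch = steps_per_epoch * total_train_batch_size
print(samples_per_epoch)  # 360
```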
adapter_config.json CHANGED
@@ -20,12 +20,12 @@
   "revision": null,
   "target_modules": [
     "k_proj",
+    "q_proj",
+    "o_proj",
     "v_proj",
     "down_proj",
-    "o_proj",
     "gate_proj",
-    "up_proj",
-    "q_proj"
+    "up_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
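Note that the adapter_config.json hunk only reorders the `target_modules` list; the set of projection matrices the LoRA adapter attaches to is unchanged (in a live PEFT setup this list would be what gets passed to `peft.LoraConfig(target_modules=...)`). A quick stdlib-only check, with both orderings copied from the diff:

```python
import json

# Module lists before and after the commit, taken from the diff above.
old_modules = ["k_proj", "v_proj", "down_proj", "o_proj", "gate_proj", "up_proj", "q_proj"]
new_modules = ["k_proj", "q_proj", "o_proj", "v_proj", "down_proj", "gate_proj", "up_proj"]

# Same seven modules in both versions -- only the order differs.
assert set(old_modules) == set(new_modules)

# Fragment mirroring the committed config (task type and rsLoRA flag from the diff).
config_fragment = {
    "target_modules": new_modules,
    "task_type": "CAUSAL_LM",
    "use_rslora": False,
}
print(json.dumps(config_fragment, indent=2))
```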
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:db2d4780d476fe34f5fa2cfb8dbcb8e2b622d191c2589aabb8697798de733d54
+oid sha256:2e208e168880a67f4062f39f3c1353edf46dc9a109996284ccbcb13dbc31803c
 size 83945296
tokenizer.json CHANGED
@@ -1,6 +1,11 @@
 {
   "version": "1.0",
-  "truncation": null,
+  "truncation": {
+    "direction": "Left",
+    "max_length": 1024,
+    "strategy": "LongestFirst",
+    "stride": 0
+  },
   "padding": null,
   "added_tokens": [
     {
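The tokenizer.json hunk switches truncation from disabled (`null`) to left-truncation at 1024 tokens, i.e. sequences longer than the limit lose their oldest tokens first. A sketch of the resulting config fragment (values from the diff; the runtime equivalent with the Hugging Face `tokenizers` library would be roughly `tokenizer.enable_truncation(max_length=1024, direction="left")`):

```python
import json

# Truncation settings introduced by the commit: inputs longer than 1024
# tokens are clipped from the left, no overlap (stride 0).
truncation = {
    "direction": "Left",
    "max_length": 1024,
    "strategy": "LongestFirst",
    "stride": 0,
}

print(json.dumps({"truncation": truncation}, indent=2))
```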
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2269b28e3e19060eeca41ec0d6f6e8ae69b25ede61c535a3eb485dbb46406f62
-size 4347
+oid sha256:227073392973223d08e042f4c146fab302e58df5fc1551fcc10a7ab3649f5d7b
+size 4283