Commit 93731db (verified) · committed by renqiux0302 · 1 parent: 1c473ef

Delete auxiliary_decoder

auxiliary_decoder/README.md DELETED
@@ -1,21 +0,0 @@
- ---
- library_name: peft
- ---
- ## Training procedure
-
-
- The following `bitsandbytes` quantization config was used during training:
- - quant_method: bitsandbytes
- - load_in_8bit: True
- - load_in_4bit: False
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: fp4
- - bnb_4bit_use_double_quant: False
- - bnb_4bit_compute_dtype: float32
- ### Framework versions
-
-
- - PEFT 0.4.0
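The quantization settings listed in the deleted README can be read back into a typed dictionary. The following is a minimal sketch using only the standard library; the helper names `parse_quant_config` and `_coerce` are hypothetical, and the coercion rules (booleans, `None`, floats, otherwise strings) are an assumption about how the card was written. Apart from `quant_method`, the keys appear to correspond to constructor parameters of `transformers.BitsAndBytesConfig`, which should be checked against the installed version before relying on it.

```python
# The bullet list from auxiliary_decoder/README.md, copied verbatim.
README_LINES = """\
- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
""".splitlines()


def _coerce(value):
    """Turn a card value into bool, None, float, or leave it as a string."""
    if value in ("True", "False"):
        return value == "True"
    if value == "None":
        return None
    try:
        return float(value)
    except ValueError:
        return value


def parse_quant_config(lines):
    """Parse '- key: value' bullet lines into a dictionary."""
    config = {}
    for line in lines:
        key, _, value = line.lstrip("- ").partition(": ")
        config[key] = _coerce(value.strip())
    return config


config = parse_quant_config(README_LINES)
print(config["load_in_8bit"], config["llm_int8_threshold"])
```

The parsed values show that the adapter was trained against an 8-bit base model; the `bnb_4bit_*` entries are present but inactive because `load_in_4bit` is `False`.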
 
auxiliary_decoder/adapter_config.json DELETED
@@ -1,21 +0,0 @@
- {
- "auto_mapping": null,
- "base_model_name_or_path": "/cpfs01/shared/ADLab/hug_ckpts/vicuna-7b-v1.5",
- "bias": "none",
- "fan_in_fan_out": false,
- "inference_mode": true,
- "init_lora_weights": true,
- "layers_pattern": null,
- "layers_to_transform": null,
- "lora_alpha": 32,
- "lora_dropout": 0.05,
- "modules_to_save": null,
- "peft_type": "LORA",
- "r": 16,
- "revision": null,
- "target_modules": [
- "q_proj",
- "v_proj"
- ],
- "task_type": "CAUSAL_LM"
- }
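The LoRA hyperparameters in this adapter config determine both the update scaling and the number of trainable parameters. The sketch below works through that arithmetic under two assumptions: the scaling follows the standard LoRA rule `lora_alpha / r`, and `q_proj` and `v_proj` are square `hidden_size x hidden_size` layers, which is consistent with `base/config.json` in this commit (`num_key_value_heads` equals `num_attention_heads`). The function name `lora_param_count` is hypothetical.

```python
# Values copied from adapter_config.json in this commit.
r = 16
lora_alpha = 32
target_modules = ["q_proj", "v_proj"]

# Values copied from base/config.json in this commit.
hidden_size = 4096
num_hidden_layers = 32


def lora_param_count(in_features, out_features, rank):
    """Parameters in A (rank x in) plus B (out x rank) for one adapted layer."""
    return rank * in_features + out_features * rank


# Standard LoRA scaling applied to the low-rank update B @ A.
scaling = lora_alpha / r

per_module = lora_param_count(hidden_size, hidden_size, r)
total = per_module * len(target_modules) * num_hidden_layers
print(scaling, per_module, total)
```

Under these assumptions the adapter scales its update by 2.0 and trains 131,072 parameters per adapted projection, or 8,388,608 across both target modules in all 32 layers.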
 
auxiliary_decoder/adapter_model.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:e5e1621f48d9ad8feb1d6d31050275f0aafd080c5c07153301fe2f48411f4406
- size 443
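The three lines above are a Git LFS pointer: the repository stores this small text stub while the binary lives in LFS storage. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` and the returned dictionary layout are hypothetical, and the parser assumes each line has the form `key value`.

```python
# Pointer text copied from auxiliary_decoder/adapter_model.bin in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e5e1621f48d9ad8feb1d6d31050275f0aafd080c5c07153301fe2f48411f4406
size 443
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algo"], pointer["size"])
```

The same function applies to the two `pytorch_model-0000x-of-00002.bin` pointers later in this commit, whose `size` fields give the shard sizes in bytes.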
 
auxiliary_decoder/base/README.md DELETED
@@ -1,48 +0,0 @@
- ---
- inference: false
- license: llama2
- ---
-
- # Vicuna Model Card
-
- ## Model Details
-
- Vicuna is a chat assistant trained by fine-tuning Llama 2 on user-shared conversations collected from ShareGPT.
-
- - **Developed by:** [LMSYS](https://lmsys.org/)
- - **Model type:** An auto-regressive language model based on the transformer architecture
- - **License:** Llama 2 Community License Agreement
- - **Finetuned from model:** [Llama 2](https://arxiv.org/abs/2307.09288)
-
- ### Model Sources
-
- - **Repository:** https://github.com/lm-sys/FastChat
- - **Blog:** https://lmsys.org/blog/2023-03-30-vicuna/
- - **Paper:** https://arxiv.org/abs/2306.05685
- - **Demo:** https://chat.lmsys.org/
-
- ## Uses
-
- The primary use of Vicuna is research on large language models and chatbots.
- The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
-
- ## How to Get Started with the Model
-
- - Command line interface: https://github.com/lm-sys/FastChat#vicuna-weights
- - APIs (OpenAI API, Huggingface API): https://github.com/lm-sys/FastChat/tree/main#api
-
- ## Training Details
-
- Vicuna v1.5 is fine-tuned from Llama 2 with supervised instruction fine-tuning.
- The training data is around 125K conversations collected from ShareGPT.com.
- See more details in the "Training Details of Vicuna Models" section in the appendix of this [paper](https://arxiv.org/pdf/2306.05685.pdf).
-
- ## Evaluation
-
- ![Evaluation Results](https://github.com/lm-sys/lm-sys.github.io/blob/main/public/images/webdata/vicuna_v1.5_eval.png?raw=true)
-
- Vicuna is evaluated with standard benchmarks, human preference, and LLM-as-a-judge. See more details in this [paper](https://arxiv.org/pdf/2306.05685.pdf) and [leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard).
-
- ## Difference between different versions of Vicuna
-
- See [vicuna_weights_version.md](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md)
 
auxiliary_decoder/base/config.json DELETED
@@ -1,26 +0,0 @@
- {
- "_name_or_path": "vicuna-7b-v1.5",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "bos_token_id": 1,
- "eos_token_id": 2,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 11008,
- "max_position_embeddings": 4096,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 32,
- "pad_token_id": 0,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "tie_word_embeddings": false,
- "torch_dtype": "float16",
- "transformers_version": "4.31.0",
- "use_cache": true,
- "vocab_size": 32000
- }
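The dimensions in this config are enough to estimate the checkpoint size and compare it with the `total_size` of 13476839424 bytes recorded in `pytorch_model.bin.index.json` in the same commit. The sketch below is a back-of-the-envelope calculation, not output from any library. It assumes that weights are stored as 2-byte `float16` (per `torch_dtype`), that a final `model.norm.weight` exists even though it is not visible in the truncated index, and that each layer's `rotary_emb.inv_freq` buffer holds `head_dim / 2` values in 4-byte `float32`.

```python
# Values copied from base/config.json in this commit.
hidden_size = 4096
intermediate_size = 11008
num_hidden_layers = 32
num_attention_heads = 32
vocab_size = 32000

# Per-layer weights named in the checkpoint index: q/k/v/o projections,
# gate/up/down MLP projections, and two RMSNorm weight vectors.
attention = 4 * hidden_size * hidden_size
mlp = 3 * hidden_size * intermediate_size
norms = 2 * hidden_size
per_layer = attention + mlp + norms

# embed_tokens and lm_head are counted separately because
# tie_word_embeddings is false.
embeddings = 2 * vocab_size * hidden_size
final_norm = hidden_size  # assumption: model.norm.weight

params = num_hidden_layers * per_layer + embeddings + final_norm

# Assumption: float16 weights (2 bytes each) plus float32 inv_freq buffers.
head_dim = hidden_size // num_attention_heads
inv_freq_bytes = num_hidden_layers * (head_dim // 2) * 4
total_bytes = params * 2 + inv_freq_bytes
print(params, total_bytes)
```

Under these assumptions the model has 6,738,415,616 parameters and the estimated byte count equals the `total_size` recorded in the index.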
 
auxiliary_decoder/base/generation_config.json DELETED
@@ -1,9 +0,0 @@
- {
- "bos_token_id": 1,
- "eos_token_id": 2,
- "max_length": 4096,
- "pad_token_id": 0,
- "temperature": 0.9,
- "top_p": 0.6,
- "transformers_version": "4.31.0"
- }
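The `temperature` and `top_p` values in this generation config shape the token distribution when sampling is enabled. The following sketch illustrates the general technique (temperature scaling followed by nucleus filtering) in plain Python; it is not the `transformers` implementation, and the function name `top_p_filter` and the example logits are hypothetical.

```python
import math


def top_p_filter(logits, temperature=0.9, top_p=0.6):
    """Return {token_index: probability} for the nucleus of the distribution."""
    # Temperature below 1 sharpens the distribution before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


peaked = top_p_filter([2.0, 1.0, 0.0])
flat = top_p_filter([1.0, 1.0, 1.0, 1.0])
print(sorted(peaked), sorted(flat))
```

With a peaked distribution only the top token survives the 0.6 threshold, while a uniform distribution over four tokens keeps three of them, which shows how a fairly low `top_p` narrows sampling.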
 
auxiliary_decoder/base/gitattributes DELETED
@@ -1,35 +0,0 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
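Each rule above routes matching files through Git LFS. The sketch below approximates that matching for a handful of the simple glob patterns using Python's `fnmatch`; this is only an approximation, since gitattributes pattern rules differ from `fnmatch` in places (the `saved_model/**/*` rule is deliberately left out). The helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase

# A subset of the simple glob patterns from the deleted gitattributes file.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.pt", "*.model", "*tfevents*"]


def is_lfs_tracked(filename, patterns=LFS_PATTERNS):
    """Return True if the bare file name matches any LFS glob pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# File names taken from this commit.
files = ["adapter_model.bin", "pytorch_model-00001-of-00002.bin",
         "config.json", "README.md"]
tracked = [name for name in files if is_lfs_tracked(name)]
print(tracked)
```

This is consistent with the diff itself, where the `.bin` files appear as LFS pointers while the JSON and Markdown files appear as plain text.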
 
auxiliary_decoder/base/pytorch_model-00001-of-00002.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:4133d2fcc5f31286881ea50806d95b721d016b533036a99dedce3f8fe88520e6
- size 9976634558
 
auxiliary_decoder/base/pytorch_model-00002-of-00002.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:d261d3c35e92d3070d1e61ed821ebfca812a847d2a880757d82728acf005c5ac
- size 3500315539
 
auxiliary_decoder/base/pytorch_model.bin.index.json DELETED
@@ -1,330 +0,0 @@
- {
- "metadata": {
- "total_size": 13476839424
- },
- "weight_map": {
- "lm_head.weight": "pytorch_model-00002-of-00002.bin",
- "model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.11.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.12.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.13.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.14.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.15.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.16.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.17.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.18.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.19.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.20.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.21.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.22.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.23.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.24.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.24.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.25.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.26.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.27.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.28.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.29.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.30.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.30.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
- "model.layers.31.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
- "model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
- "model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
- "model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
321
- "model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
322
- "model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
323
- "model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
324
- "model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
325
- "model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
326
- "model.layers.9.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
327
- "model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
328
- "model.norm.weight": "pytorch_model-00002-of-00002.bin"
329
- }
330
- }

auxiliary_decoder/base/special_tokens_map.json DELETED
@@ -1,24 +0,0 @@
1
- {
2
- "bos_token": {
3
- "content": "<s>",
4
- "lstrip": false,
5
- "normalized": false,
6
- "rstrip": false,
7
- "single_word": false
8
- },
9
- "eos_token": {
10
- "content": "</s>",
11
- "lstrip": false,
12
- "normalized": false,
13
- "rstrip": false,
14
- "single_word": false
15
- },
16
- "pad_token": "<unk>",
17
- "unk_token": {
18
- "content": "<unk>",
19
- "lstrip": false,
20
- "normalized": false,
21
- "rstrip": false,
22
- "single_word": false
23
- }
24
- }

auxiliary_decoder/base/tokenizer.model DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
3
- size 499723

auxiliary_decoder/base/tokenizer_config.json DELETED
@@ -1,35 +0,0 @@
1
- {
2
- "add_bos_token": true,
3
- "add_eos_token": false,
4
- "bos_token": {
5
- "__type": "AddedToken",
6
- "content": "<s>",
7
- "lstrip": false,
8
- "normalized": false,
9
- "rstrip": false,
10
- "single_word": false
11
- },
12
- "clean_up_tokenization_spaces": false,
13
- "eos_token": {
14
- "__type": "AddedToken",
15
- "content": "</s>",
16
- "lstrip": false,
17
- "normalized": false,
18
- "rstrip": false,
19
- "single_word": false
20
- },
21
- "legacy": false,
22
- "model_max_length": 4096,
23
- "pad_token": null,
24
- "padding_side": "right",
25
- "sp_model_kwargs": {},
26
- "tokenizer_class": "LlamaTokenizer",
27
- "unk_token": {
28
- "__type": "AddedToken",
29
- "content": "<unk>",
30
- "lstrip": false,
31
- "normalized": false,
32
- "rstrip": false,
33
- "single_word": false
34
- }
35
- }

auxiliary_decoder/optimizer.pt DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:cbcdd53f217ad937d0347e62c442748acb981f5b3e963284f09bd1317f16546f
3
- size 67216517

auxiliary_decoder/rng_state_0.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:99929bd03b48e29ef637de6ede9dd8f967124847190195d30b291479474bd0b6
3
- size 21687

auxiliary_decoder/rng_state_1.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:ba9110abd238d877fe40302d3a99ab2a07d0a08a3046bcf8659666a97cf3c4bd
3
- size 21687

auxiliary_decoder/rng_state_2.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:c4691041a42ddadbc0735b74ddb085e04a5e4636f3f3a625dff055f67734a886
3
- size 21687

auxiliary_decoder/rng_state_3.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:93211ed628a3ffb6234f03a6de1cacf8d171d9c8f5ba068610b72fcf413643e7
3
- size 21687

auxiliary_decoder/rng_state_4.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:44ee2aab0f3ad1ccb9a537bac135dd2b5a88a9c5bfb76d04070e78438ba8eb8e
3
- size 21687

auxiliary_decoder/rng_state_5.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:6c3524f72d69da29ddc7e1b7d2020eaf3c17fdb7760b26b572369cd0febad4de
3
- size 21687

auxiliary_decoder/rng_state_6.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:aec123ea87d4cc11d173df7c3aee5d70636a30ffaf7da927e3c3bde634702b9d
3
- size 21687

auxiliary_decoder/rng_state_7.pth DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:eb9f4d99b6fdf30377370b405acacb904ee76b422c17d47ee47f0e09827fb2af
3
- size 21687

auxiliary_decoder/scheduler.pt DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:9c767e531f5cc92dcf2b7622466a92b93376b191b8847a14de57d9649808db98
3
- size 627

auxiliary_decoder/trainer_state.json DELETED
@@ -1,787 +0,0 @@
1
- {
2
- "best_metric": 0.3883955776691437,
3
- "best_model_checkpoint": "exp/vicuna-7b-lora-sft-code_qa_desc_summ_triplet_r_16_alpha_32_8GPUs-0116/checkpoint-1200",
4
- "epoch": 4.375569735642662,
5
- "eval_steps": 200,
6
- "global_step": 1200,
7
- "is_hyper_param_search": false,
8
- "is_local_process_zero": true,
9
- "is_world_process_zero": true,
10
- "log_history": [
11
- {
12
- "epoch": 0.04,
13
- "learning_rate": 2.9999999999999997e-05,
14
- "loss": 1.4343,
15
- "step": 10
16
- },
17
- {
18
- "epoch": 0.07,
19
- "learning_rate": 5.9999999999999995e-05,
20
- "loss": 1.4848,
21
- "step": 20
22
- },
23
- {
24
- "epoch": 0.11,
25
- "learning_rate": 8.999999999999999e-05,
26
- "loss": 1.1941,
27
- "step": 30
28
- },
29
- {
30
- "epoch": 0.15,
31
- "learning_rate": 0.00011999999999999999,
32
- "loss": 0.8226,
33
- "step": 40
34
- },
35
- {
36
- "epoch": 0.18,
37
- "learning_rate": 0.00015,
38
- "loss": 0.6671,
39
- "step": 50
40
- },
41
- {
42
- "epoch": 0.22,
43
- "learning_rate": 0.00017999999999999998,
44
- "loss": 0.5676,
45
- "step": 60
46
- },
47
- {
48
- "epoch": 0.26,
49
- "learning_rate": 0.00020999999999999998,
50
- "loss": 0.5655,
51
- "step": 70
52
- },
53
- {
54
- "epoch": 0.29,
55
- "learning_rate": 0.00023999999999999998,
56
- "loss": 0.5251,
57
- "step": 80
58
- },
59
- {
60
- "epoch": 0.33,
61
- "learning_rate": 0.00027,
62
- "loss": 0.4845,
63
- "step": 90
64
- },
65
- {
66
- "epoch": 0.36,
67
- "learning_rate": 0.0003,
68
- "loss": 0.481,
69
- "step": 100
70
- },
71
- {
72
- "epoch": 0.4,
73
- "learning_rate": 0.0002976377952755905,
74
- "loss": 0.4565,
75
- "step": 110
76
- },
77
- {
78
- "epoch": 0.44,
79
- "learning_rate": 0.0002952755905511811,
80
- "loss": 0.4625,
81
- "step": 120
82
- },
83
- {
84
- "epoch": 0.47,
85
- "learning_rate": 0.00029291338582677163,
86
- "loss": 0.4584,
87
- "step": 130
88
- },
89
- {
90
- "epoch": 0.51,
91
- "learning_rate": 0.00029055118110236217,
92
- "loss": 0.4425,
93
- "step": 140
94
- },
95
- {
96
- "epoch": 0.55,
97
- "learning_rate": 0.0002881889763779527,
98
- "loss": 0.4573,
99
- "step": 150
100
- },
101
- {
102
- "epoch": 0.58,
103
- "learning_rate": 0.0002858267716535433,
104
- "loss": 0.4361,
105
- "step": 160
106
- },
107
- {
108
- "epoch": 0.62,
109
- "learning_rate": 0.00028346456692913383,
110
- "loss": 0.4396,
111
- "step": 170
112
- },
113
- {
114
- "epoch": 0.66,
115
- "learning_rate": 0.00028110236220472436,
116
- "loss": 0.4391,
117
- "step": 180
118
- },
119
- {
120
- "epoch": 0.69,
121
- "learning_rate": 0.00027874015748031495,
122
- "loss": 0.418,
123
- "step": 190
124
- },
125
- {
126
- "epoch": 0.73,
127
- "learning_rate": 0.0002763779527559055,
128
- "loss": 0.4469,
129
- "step": 200
130
- },
131
- {
132
- "epoch": 0.73,
133
- "eval_loss": 0.4269736409187317,
134
- "eval_runtime": 19.352,
135
- "eval_samples_per_second": 103.348,
136
- "eval_steps_per_second": 1.654,
137
- "step": 200
138
- },
139
- {
140
- "epoch": 0.77,
141
- "learning_rate": 0.0002740157480314961,
142
- "loss": 0.4149,
143
- "step": 210
144
- },
145
- {
146
- "epoch": 0.8,
147
- "learning_rate": 0.00027165354330708656,
148
- "loss": 0.428,
149
- "step": 220
150
- },
151
- {
152
- "epoch": 0.84,
153
- "learning_rate": 0.00026929133858267715,
154
- "loss": 0.4248,
155
- "step": 230
156
- },
157
- {
158
- "epoch": 0.88,
159
- "learning_rate": 0.0002669291338582677,
160
- "loss": 0.4249,
161
- "step": 240
162
- },
163
- {
164
- "epoch": 0.91,
165
- "learning_rate": 0.0002645669291338582,
166
- "loss": 0.4331,
167
- "step": 250
168
- },
169
- {
170
- "epoch": 0.95,
171
- "learning_rate": 0.0002622047244094488,
172
- "loss": 0.4192,
173
- "step": 260
174
- },
175
- {
176
- "epoch": 0.98,
177
- "learning_rate": 0.00025984251968503934,
178
- "loss": 0.4204,
179
- "step": 270
180
- },
181
- {
182
- "epoch": 1.02,
183
- "learning_rate": 0.00025748031496062993,
184
- "loss": 0.4318,
185
- "step": 280
186
- },
187
- {
188
- "epoch": 1.06,
189
- "learning_rate": 0.00025511811023622047,
190
- "loss": 0.4229,
191
- "step": 290
192
- },
193
- {
194
- "epoch": 1.09,
195
- "learning_rate": 0.000252755905511811,
196
- "loss": 0.4214,
197
- "step": 300
198
- },
199
- {
200
- "epoch": 1.13,
201
- "learning_rate": 0.00025039370078740154,
202
- "loss": 0.416,
203
- "step": 310
204
- },
205
- {
206
- "epoch": 1.17,
207
- "learning_rate": 0.00024803149606299207,
208
- "loss": 0.4199,
209
- "step": 320
210
- },
211
- {
212
- "epoch": 1.2,
213
- "learning_rate": 0.00024566929133858266,
214
- "loss": 0.4218,
215
- "step": 330
216
- },
217
- {
218
- "epoch": 1.24,
219
- "learning_rate": 0.0002433070866141732,
220
- "loss": 0.4113,
221
- "step": 340
222
- },
223
- {
224
- "epoch": 1.28,
225
- "learning_rate": 0.00024094488188976376,
226
- "loss": 0.4185,
227
- "step": 350
228
- },
229
- {
230
- "epoch": 1.31,
231
- "learning_rate": 0.00023858267716535432,
232
- "loss": 0.4168,
233
- "step": 360
234
- },
235
- {
236
- "epoch": 1.35,
237
- "learning_rate": 0.00023622047244094488,
238
- "loss": 0.4162,
239
- "step": 370
240
- },
241
- {
242
- "epoch": 1.39,
243
- "learning_rate": 0.0002338582677165354,
244
- "loss": 0.4175,
245
- "step": 380
246
- },
247
- {
248
- "epoch": 1.42,
249
- "learning_rate": 0.00023149606299212595,
250
- "loss": 0.4045,
251
- "step": 390
252
- },
253
- {
254
- "epoch": 1.46,
255
- "learning_rate": 0.00022913385826771652,
256
- "loss": 0.4152,
257
- "step": 400
258
- },
259
- {
260
- "epoch": 1.46,
261
- "eval_loss": 0.4086858630180359,
262
- "eval_runtime": 19.2818,
263
- "eval_samples_per_second": 103.725,
264
- "eval_steps_per_second": 1.66,
265
- "step": 400
266
- },
267
- {
268
- "epoch": 1.49,
269
- "learning_rate": 0.00022677165354330705,
270
- "loss": 0.415,
271
- "step": 410
272
- },
273
- {
274
- "epoch": 1.53,
275
- "learning_rate": 0.00022440944881889761,
276
- "loss": 0.4091,
277
- "step": 420
278
- },
279
- {
280
- "epoch": 1.57,
281
- "learning_rate": 0.00022204724409448818,
282
- "loss": 0.4132,
283
- "step": 430
284
- },
285
- {
286
- "epoch": 1.6,
287
- "learning_rate": 0.00021968503937007874,
288
- "loss": 0.3985,
289
- "step": 440
290
- },
291
- {
292
- "epoch": 1.64,
293
- "learning_rate": 0.00021732283464566927,
294
- "loss": 0.4056,
295
- "step": 450
296
- },
297
- {
298
- "epoch": 1.68,
299
- "learning_rate": 0.0002149606299212598,
300
- "loss": 0.4005,
301
- "step": 460
302
- },
303
- {
304
- "epoch": 1.71,
305
- "learning_rate": 0.00021259842519685037,
306
- "loss": 0.4059,
307
- "step": 470
308
- },
309
- {
310
- "epoch": 1.75,
311
- "learning_rate": 0.0002102362204724409,
312
- "loss": 0.409,
313
- "step": 480
314
- },
315
- {
316
- "epoch": 1.79,
317
- "learning_rate": 0.00020787401574803147,
318
- "loss": 0.4031,
319
- "step": 490
320
- },
321
- {
322
- "epoch": 1.82,
323
- "learning_rate": 0.00020551181102362203,
324
- "loss": 0.4097,
325
- "step": 500
326
- },
327
- {
328
- "epoch": 1.86,
329
- "learning_rate": 0.0002031496062992126,
330
- "loss": 0.4017,
331
- "step": 510
332
- },
333
- {
334
- "epoch": 1.9,
335
- "learning_rate": 0.00020078740157480313,
336
- "loss": 0.4026,
337
- "step": 520
338
- },
339
- {
340
- "epoch": 1.93,
341
- "learning_rate": 0.0001984251968503937,
342
- "loss": 0.4106,
343
- "step": 530
344
- },
345
- {
346
- "epoch": 1.97,
347
- "learning_rate": 0.00019606299212598423,
348
- "loss": 0.395,
349
- "step": 540
350
- },
351
- {
352
- "epoch": 2.01,
353
- "learning_rate": 0.0001937007874015748,
354
- "loss": 0.3988,
355
- "step": 550
356
- },
357
- {
358
- "epoch": 2.04,
359
- "learning_rate": 0.00019133858267716532,
360
- "loss": 0.409,
361
- "step": 560
362
- },
363
- {
364
- "epoch": 2.08,
365
- "learning_rate": 0.00018897637795275589,
366
- "loss": 0.3997,
367
- "step": 570
368
- },
369
- {
370
- "epoch": 2.11,
371
- "learning_rate": 0.00018661417322834645,
372
- "loss": 0.4007,
373
- "step": 580
374
- },
375
- {
376
- "epoch": 2.15,
377
- "learning_rate": 0.000184251968503937,
378
- "loss": 0.3905,
379
- "step": 590
380
- },
381
- {
382
- "epoch": 2.19,
383
- "learning_rate": 0.00018188976377952755,
384
- "loss": 0.4005,
385
- "step": 600
386
- },
387
- {
388
- "epoch": 2.19,
389
- "eval_loss": 0.40032637119293213,
390
- "eval_runtime": 19.2818,
391
- "eval_samples_per_second": 103.725,
392
- "eval_steps_per_second": 1.66,
393
- "step": 600
394
- },
395
- {
396
- "epoch": 2.22,
397
- "learning_rate": 0.0001795275590551181,
398
- "loss": 0.3983,
399
- "step": 610
400
- },
401
- {
402
- "epoch": 2.26,
403
- "learning_rate": 0.00017716535433070864,
404
- "loss": 0.3881,
405
- "step": 620
406
- },
407
- {
408
- "epoch": 2.3,
409
- "learning_rate": 0.00017480314960629918,
410
- "loss": 0.4008,
411
- "step": 630
412
- },
413
- {
414
- "epoch": 2.33,
415
- "learning_rate": 0.00017244094488188974,
416
- "loss": 0.3927,
417
- "step": 640
418
- },
419
- {
420
- "epoch": 2.37,
421
- "learning_rate": 0.0001700787401574803,
422
- "loss": 0.4005,
423
- "step": 650
424
- },
425
- {
426
- "epoch": 2.41,
427
- "learning_rate": 0.00016771653543307086,
428
- "loss": 0.3962,
429
- "step": 660
430
- },
431
- {
432
- "epoch": 2.44,
433
- "learning_rate": 0.0001653543307086614,
434
- "loss": 0.3902,
435
- "step": 670
436
- },
437
- {
438
- "epoch": 2.48,
439
- "learning_rate": 0.00016299212598425196,
440
- "loss": 0.3911,
441
- "step": 680
442
- },
443
- {
444
- "epoch": 2.52,
445
- "learning_rate": 0.00016062992125984252,
446
- "loss": 0.3891,
447
- "step": 690
448
- },
449
- {
450
- "epoch": 2.55,
451
- "learning_rate": 0.00015826771653543303,
452
- "loss": 0.3939,
453
- "step": 700
454
- },
455
- {
456
- "epoch": 2.59,
457
- "learning_rate": 0.0001559055118110236,
458
- "loss": 0.4001,
459
- "step": 710
460
- },
461
- {
462
- "epoch": 2.63,
463
- "learning_rate": 0.00015354330708661416,
464
- "loss": 0.3918,
465
- "step": 720
466
- },
467
- {
468
- "epoch": 2.66,
469
- "learning_rate": 0.00015118110236220472,
470
- "loss": 0.3979,
471
- "step": 730
472
- },
473
- {
474
- "epoch": 2.7,
475
- "learning_rate": 0.00014881889763779525,
476
- "loss": 0.3793,
477
- "step": 740
478
- },
479
- {
480
- "epoch": 2.73,
481
- "learning_rate": 0.00014645669291338582,
482
- "loss": 0.3879,
483
- "step": 750
484
- },
485
- {
486
- "epoch": 2.77,
487
- "learning_rate": 0.00014409448818897635,
488
- "loss": 0.3915,
489
- "step": 760
490
- },
491
- {
492
- "epoch": 2.81,
493
- "learning_rate": 0.00014173228346456691,
494
- "loss": 0.3831,
495
- "step": 770
496
- },
497
- {
498
- "epoch": 2.84,
499
- "learning_rate": 0.00013937007874015748,
500
- "loss": 0.3838,
501
- "step": 780
502
- },
503
- {
504
- "epoch": 2.88,
505
- "learning_rate": 0.00013700787401574804,
506
- "loss": 0.3734,
507
- "step": 790
508
- },
509
- {
510
- "epoch": 2.92,
511
- "learning_rate": 0.00013464566929133857,
512
- "loss": 0.3872,
513
- "step": 800
514
- },
515
- {
516
- "epoch": 2.92,
517
- "eval_loss": 0.3944130539894104,
518
- "eval_runtime": 19.2596,
519
- "eval_samples_per_second": 103.844,
520
- "eval_steps_per_second": 1.662,
521
- "step": 800
522
- },
523
- {
524
- "epoch": 2.95,
525
- "learning_rate": 0.0001322834645669291,
526
- "loss": 0.386,
527
- "step": 810
528
- },
529
- {
530
- "epoch": 2.99,
531
- "learning_rate": 0.00012992125984251967,
532
- "loss": 0.3799,
533
- "step": 820
534
- },
535
- {
536
- "epoch": 3.03,
537
- "learning_rate": 0.00012755905511811023,
538
- "loss": 0.3895,
539
- "step": 830
540
- },
541
- {
542
- "epoch": 3.06,
543
- "learning_rate": 0.00012519685039370077,
544
- "loss": 0.3852,
545
- "step": 840
546
- },
547
- {
548
- "epoch": 3.1,
549
- "learning_rate": 0.00012283464566929133,
550
- "loss": 0.3879,
551
- "step": 850
552
- },
553
- {
554
- "epoch": 3.14,
555
- "learning_rate": 0.00012047244094488188,
556
- "loss": 0.3892,
557
- "step": 860
558
- },
559
- {
560
- "epoch": 3.17,
561
- "learning_rate": 0.00011811023622047244,
562
- "loss": 0.3801,
563
- "step": 870
564
- },
565
- {
566
- "epoch": 3.21,
567
- "learning_rate": 0.00011574803149606298,
568
- "loss": 0.3802,
569
- "step": 880
570
- },
571
- {
572
- "epoch": 3.25,
573
- "learning_rate": 0.00011338582677165353,
574
- "loss": 0.3863,
575
- "step": 890
576
- },
577
- {
578
- "epoch": 3.28,
579
- "learning_rate": 0.00011102362204724409,
580
- "loss": 0.3792,
581
- "step": 900
582
- },
583
- {
584
- "epoch": 3.32,
585
- "learning_rate": 0.00010866141732283464,
586
- "loss": 0.3923,
587
- "step": 910
588
- },
589
- {
590
- "epoch": 3.35,
591
- "learning_rate": 0.00010629921259842519,
592
- "loss": 0.3753,
593
- "step": 920
594
- },
595
- {
596
- "epoch": 3.39,
597
- "learning_rate": 0.00010393700787401573,
598
- "loss": 0.3777,
599
- "step": 930
600
- },
601
- {
602
- "epoch": 3.43,
603
- "learning_rate": 0.0001015748031496063,
604
- "loss": 0.3849,
605
- "step": 940
606
- },
607
- {
608
- "epoch": 3.46,
609
- "learning_rate": 9.921259842519685e-05,
610
- "loss": 0.3775,
611
- "step": 950
612
- },
613
- {
614
- "epoch": 3.5,
615
- "learning_rate": 9.68503937007874e-05,
616
- "loss": 0.3853,
617
- "step": 960
618
- },
619
- {
620
- "epoch": 3.54,
621
- "learning_rate": 9.448818897637794e-05,
622
- "loss": 0.3719,
623
- "step": 970
624
- },
625
- {
626
- "epoch": 3.57,
627
- "learning_rate": 9.21259842519685e-05,
628
- "loss": 0.3779,
629
- "step": 980
630
- },
631
- {
632
- "epoch": 3.61,
633
- "learning_rate": 8.976377952755905e-05,
634
- "loss": 0.3921,
635
- "step": 990
636
- },
637
- {
638
- "epoch": 3.65,
639
- "learning_rate": 8.740157480314959e-05,
640
- "loss": 0.3776,
641
- "step": 1000
642
- },
643
- {
644
- "epoch": 3.65,
645
- "eval_loss": 0.3908761739730835,
646
- "eval_runtime": 19.2678,
647
- "eval_samples_per_second": 103.8,
648
- "eval_steps_per_second": 1.661,
649
- "step": 1000
650
- },
651
- {
652
- "epoch": 3.68,
653
- "learning_rate": 8.503937007874015e-05,
654
- "loss": 0.3889,
655
- "step": 1010
656
- },
657
- {
658
- "epoch": 3.72,
659
- "learning_rate": 8.26771653543307e-05,
660
- "loss": 0.3819,
661
- "step": 1020
662
- },
663
- {
664
- "epoch": 3.76,
665
- "learning_rate": 8.031496062992126e-05,
666
- "loss": 0.3758,
667
- "step": 1030
668
- },
669
- {
670
- "epoch": 3.79,
671
- "learning_rate": 7.79527559055118e-05,
672
- "loss": 0.3753,
673
- "step": 1040
674
- },
675
- {
676
- "epoch": 3.83,
677
- "learning_rate": 7.559055118110236e-05,
678
- "loss": 0.3737,
679
- "step": 1050
680
- },
681
- {
682
- "epoch": 3.87,
683
- "learning_rate": 7.322834645669291e-05,
684
- "loss": 0.3833,
685
- "step": 1060
686
- },
687
- {
688
- "epoch": 3.9,
689
- "learning_rate": 7.086614173228346e-05,
690
- "loss": 0.3625,
691
- "step": 1070
692
- },
693
- {
694
- "epoch": 3.94,
695
- "learning_rate": 6.850393700787402e-05,
696
- "loss": 0.3809,
697
- "step": 1080
698
- },
699
- {
700
- "epoch": 3.97,
701
- "learning_rate": 6.614173228346455e-05,
702
- "loss": 0.3751,
703
- "step": 1090
704
- },
705
- {
706
- "epoch": 4.01,
707
- "learning_rate": 6.377952755905512e-05,
708
- "loss": 0.3776,
709
- "step": 1100
710
- },
711
- {
712
- "epoch": 4.05,
713
- "learning_rate": 6.141732283464567e-05,
714
- "loss": 0.3748,
715
- "step": 1110
716
- },
717
- {
718
- "epoch": 4.08,
719
- "learning_rate": 5.905511811023622e-05,
720
- "loss": 0.3636,
721
- "step": 1120
722
- },
723
- {
724
- "epoch": 4.12,
725
- "learning_rate": 5.669291338582676e-05,
726
- "loss": 0.372,
727
- "step": 1130
728
- },
729
- {
730
- "epoch": 4.16,
731
- "learning_rate": 5.433070866141732e-05,
732
- "loss": 0.3795,
733
- "step": 1140
734
- },
735
- {
736
- "epoch": 4.19,
737
- "learning_rate": 5.196850393700787e-05,
738
- "loss": 0.3632,
739
- "step": 1150
740
- },
741
- {
742
- "epoch": 4.23,
743
- "learning_rate": 4.960629921259842e-05,
744
- "loss": 0.3806,
745
- "step": 1160
746
- },
747
- {
748
- "epoch": 4.27,
749
- "learning_rate": 4.724409448818897e-05,
750
- "loss": 0.3732,
751
- "step": 1170
752
- },
753
- {
754
- "epoch": 4.3,
755
- "learning_rate": 4.488188976377953e-05,
756
- "loss": 0.3818,
757
- "step": 1180
758
- },
759
- {
760
- "epoch": 4.34,
761
- "learning_rate": 4.2519685039370076e-05,
762
- "loss": 0.3766,
763
- "step": 1190
764
- },
765
- {
766
- "epoch": 4.38,
767
- "learning_rate": 4.015748031496063e-05,
768
- "loss": 0.3587,
769
- "step": 1200
770
- },
771
- {
772
- "epoch": 4.38,
773
- "eval_loss": 0.3883955776691437,
774
- "eval_runtime": 19.3219,
775
- "eval_samples_per_second": 103.51,
776
- "eval_steps_per_second": 1.656,
777
- "step": 1200
778
- }
779
- ],
780
- "logging_steps": 10,
781
- "max_steps": 1370,
782
- "num_train_epochs": 5,
783
- "save_steps": 200,
784
- "total_flos": 2.2400975031249142e+18,
785
- "trial_name": null,
786
- "trial_params": null
787
- }

auxiliary_decoder/training_args.bin DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:f06e9a16a0ff4f4a3e1848a5ee5ca4d1a18dde6f70aaad64595a234fd7300b7f
3
- size 4155