diff --git a/.gitattributes b/.gitattributes
index a6344aac8c09253b3b630fb776ae94478aa0275b..52373fe24473b1aa44333d318f578ae6bf04b49b 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
diff --git a/README.md b/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a022801f790dc00be27d03cc1b7fd1ca7af67b93
--- /dev/null
+++ b/README.md
@@ -0,0 +1,168 @@
+---
+tags:
+- unsloth
+base_model:
+- zai-org/GLM-4.7-Flash
+language:
+ - en
+ - zh
+library_name: transformers
+license: mit
+pipeline_tag: text-generation
+---
+> [!NOTE]
+> Includes Unsloth **chat template fixes**! For `llama.cpp`, use `--jinja`.
+>
+
+
+
+
+# GLM-4.7-Flash
+
+ 👋 Join our Discord community.
+
+ 📖 Check out the GLM-4.7 technical blog and the GLM-4.5 technical report.
+
+ 📍 Use GLM-4.7-Flash API services on Z.ai API Platform.
+
+ 👉 One click to GLM-4.7.
+
+
+## Introduction
+
+GLM-4.7-Flash is a 30B-A3B MoE model. As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
+
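The "30B-A3B" label is commonly read as roughly 30B total parameters with roughly 3B active per token. The sketch below is illustrative arithmetic, not an official parameter count. It uses values from this repository's `config.json` and counts only routed-expert weights, assuming each expert holds three bias-free projection matrices (`gate_proj`, `up_proj`, `down_proj`) and that every layer after the first `first_k_dense_replace` layers is an MoE layer. Attention, embeddings, the shared expert, the dense first layer, and the next-token-prediction layer are deliberately left out.

```python
# Values copied from config.json in this repository.
HIDDEN_SIZE = 2048
MOE_INTERMEDIATE_SIZE = 1536
N_ROUTED_EXPERTS = 64
NUM_EXPERTS_PER_TOK = 4
NUM_HIDDEN_LAYERS = 47
FIRST_K_DENSE_REPLACE = 1


def expert_params() -> int:
    """Weights in one expert: gate_proj, up_proj and down_proj (assumed bias-free)."""
    return 3 * HIDDEN_SIZE * MOE_INTERMEDIATE_SIZE


def routed_expert_params(experts_per_layer: int) -> int:
    """Routed-expert weights across all MoE layers for a given expert count per layer."""
    moe_layers = NUM_HIDDEN_LAYERS - FIRST_K_DENSE_REPLACE  # assumption: all later layers are MoE
    return moe_layers * experts_per_layer * expert_params()


stored = routed_expert_params(N_ROUTED_EXPERTS)     # every routed expert is kept in memory
active = routed_expert_params(NUM_EXPERTS_PER_TOK)  # only the selected experts run per token

print(f"routed-expert weights stored:  {stored / 1e9:.2f}B")
print(f"routed-expert weights / token: {active / 1e9:.2f}B")
```

Under these assumptions, routed experts account for about 27.8B stored weights, of which about 1.7B are used for any single token. The omitted components would account for the remainder of both figures.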
+
+### Performance on Benchmarks
+
+
+| Benchmark | GLM-4.7-Flash | Qwen3-30B-A3B-Thinking-2507 | GPT-OSS-20B |
+|--------------------|---------------|-----------------------------|-------------|
+| AIME 25 | 91.6 | 85.0 | 91.7 |
+| GPQA | 75.2 | 73.4 | 71.5 |
+| LCB v6 | 64.0 | 66.0 | 61.0 |
+| HLE | 14.4 | 9.8 | 10.9 |
+| SWE-bench Verified | 59.2 | 22.0 | 34.0 |
+| τ²-Bench | 79.5 | 49.0 | 47.7 |
+| BrowseComp | 42.8 | 2.29 | 28.3 |
+
+
+## Serve GLM-4.7-Flash Locally
+
+For local deployment, GLM-4.7-Flash supports inference frameworks including vLLM and SGLang. Comprehensive deployment
+instructions are available in the official [GitHub](https://github.com/zai-org/GLM-4.5) repository.
+
+vLLM and SGLang only support GLM-4.7-Flash on their main branches.
+
+### vLLM
+
++ Using pip (the index URL must be pypi.org):
+
+```shell
+pip install -U vllm --pre --index-url https://pypi.org/simple --extra-index-url https://wheels.vllm.ai/nightly
+pip install git+https://github.com/huggingface/transformers.git
+```
+
+### SGLang
+
++ Install SGLang from source with pip, then update transformers to the latest main branch.
+
+### transformers
+
+To use the model with transformers, first install transformers from the main branch:
+
+```shell
+pip install git+https://github.com/huggingface/transformers.git
+```
+
+and then run:
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+MODEL_PATH = "zai-org/GLM-4.7-Flash"
+messages = [{"role": "user", "content": "hello"}]
+
+# Render the chat template and tokenize the prompt in one step.
+tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
+inputs = tokenizer.apply_chat_template(
+    messages,
+    tokenize=True,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt",
+)
+
+# Load the weights in bfloat16; device_map="auto" places them on the available devices.
+model = AutoModelForCausalLM.from_pretrained(
+    pretrained_model_name_or_path=MODEL_PATH,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+inputs = inputs.to(model.device)
+
+# Greedy decoding; decode only the newly generated tokens, not the prompt.
+generated_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
+output_text = tokenizer.decode(generated_ids[0][inputs.input_ids.shape[1]:])
+print(output_text)
+```
+
+### vLLM
+
+```shell
+vllm serve zai-org/GLM-4.7-Flash \
+ --tensor-parallel-size 4 \
+ --speculative-config.method mtp \
+ --speculative-config.num_speculative_tokens 1 \
+ --tool-call-parser glm47 \
+ --reasoning-parser glm45 \
+ --enable-auto-tool-choice \
+ --served-model-name glm-4.7-flash
+```
+
+### SGLang
+
+```shell
+python3 -m sglang.launch_server \
+ --model-path zai-org/GLM-4.7-Flash \
+ --tp-size 4 \
+ --tool-call-parser glm47 \
+ --reasoning-parser glm45 \
+ --speculative-algorithm EAGLE \
+ --speculative-num-steps 3 \
+ --speculative-eagle-topk 1 \
+ --speculative-num-draft-tokens 4 \
+ --mem-fraction-static 0.8 \
+ --served-model-name glm-4.7-flash \
+ --host 0.0.0.0 \
+ --port 8000
+```
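Once either server is running, it can be queried over its OpenAI-compatible HTTP API. The following is a minimal client sketch using only the Python standard library. The base URL is an assumption (port 8000 is set explicitly in the SGLang command and is the vLLM default), the model name matches `--served-model-name`, and the helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: adjust host and port to match your deployment.
BASE_URL = "http://localhost:8000/v1"
# Must match the value passed to --served-model-name.
MODEL_NAME = "glm-4.7-flash"


def build_chat_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build the JSON body for an OpenAI-style /chat/completions call."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str, base_url: str = BASE_URL, timeout: float = 60.0) -> str:
    """Send one user message and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # With a reasoning parser enabled, "content" can be empty if the token
    # budget runs out while the model is still thinking.
    return payload["choices"][0]["message"].get("content") or ""


# Example (requires a running server):
#     print(chat("hello"))
```

Keeping the payload construction in its own function makes it easy to inspect or extend the request (for example with sampling parameters) without touching the network code.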
+
+## Citation
+
+If you find our work useful in your research, please consider citing the following paper:
+
+```bibtex
+@misc{5team2025glm45agenticreasoningcoding,
+ title={GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models},
+ author={GLM Team and Aohan Zeng and Xin Lv and Qinkai Zheng and Zhenyu Hou and Bin Chen and Chengxing Xie and Cunxiang Wang and Da Yin and Hao Zeng and Jiajie Zhang and Kedong Wang and Lucen Zhong and Mingdao Liu and Rui Lu and Shulin Cao and Xiaohan Zhang and Xuancheng Huang and Yao Wei and Yean Cheng and Yifan An and Yilin Niu and Yuanhao Wen and Yushi Bai and Zhengxiao Du and Zihan Wang and Zilin Zhu and Bohan Zhang and Bosi Wen and Bowen Wu and Bowen Xu and Can Huang and Casey Zhao and Changpeng Cai and Chao Yu and Chen Li and Chendi Ge and Chenghua Huang and Chenhui Zhang and Chenxi Xu and Chenzheng Zhu and Chuang Li and Congfeng Yin and Daoyan Lin and Dayong Yang and Dazhi Jiang and Ding Ai and Erle Zhu and Fei Wang and Gengzheng Pan and Guo Wang and Hailong Sun and Haitao Li and Haiyang Li and Haiyi Hu and Hanyu Zhang and Hao Peng and Hao Tai and Haoke Zhang and Haoran Wang and Haoyu Yang and He Liu and He Zhao and Hongwei Liu and Hongxi Yan and Huan Liu and Huilong Chen and Ji Li and Jiajing Zhao and Jiamin Ren and Jian Jiao and Jiani Zhao and Jianyang Yan and Jiaqi Wang and Jiayi Gui and Jiayue Zhao and Jie Liu and Jijie Li and Jing Li and Jing Lu and Jingsen Wang and Jingwei Yuan and Jingxuan Li and Jingzhao Du and Jinhua Du and Jinxin Liu and Junkai Zhi and Junli Gao and Ke Wang and Lekang Yang and Liang Xu and Lin Fan and Lindong Wu and Lintao Ding and Lu Wang and Man Zhang and Minghao Li and Minghuan Xu and Mingming Zhao and Mingshu Zhai and Pengfan Du and Qian Dong and Shangde Lei and Shangqing Tu and Shangtong Yang and Shaoyou Lu and Shijie Li and Shuang Li and Shuang-Li and Shuxun Yang and Sibo Yi and Tianshu Yu and Wei Tian and Weihan Wang and Wenbo Yu and Weng Lam Tam and Wenjie Liang and Wentao Liu and Xiao Wang and Xiaohan Jia and Xiaotao Gu and Xiaoying Ling and Xin Wang and Xing Fan and Xingru Pan and Xinyuan Zhang and Xinze Zhang and Xiuqing Fu and Xunkai Zhang and Yabo Xu and Yandong Wu and Yida Lu and Yidong Wang and Yilin Zhou and Yiming 
Pan and Ying Zhang and Yingli Wang and Yingru Li and Yinpei Su and Yipeng Geng and Yitong Zhu and Yongkun Yang and Yuhang Li and Yuhao Wu and Yujiang Li and Yunan Liu and Yunqing Wang and Yuntao Li and Yuxuan Zhang and Zezhen Liu and Zhen Yang and Zhengda Zhou and Zhongpei Qiao and Zhuoer Feng and Zhuorui Liu and Zichen Zhang and Zihan Wang and Zijun Yao and Zikang Wang and Ziqiang Liu and Ziwei Chai and Zixuan Li and Zuodong Zhao and Wenguang Chen and Jidong Zhai and Bin Xu and Minlie Huang and Hongning Wang and Juanzi Li and Yuxiao Dong and Jie Tang},
+ year={2025},
+ eprint={2508.06471},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2508.06471},
+}
\ No newline at end of file
diff --git a/chat_template.jinja b/chat_template.jinja
new file mode 100644
index 0000000000000000000000000000000000000000..2ab98ef068d62829d17c5ade1827b9f013fa2bbf
--- /dev/null
+++ b/chat_template.jinja
@@ -0,0 +1,86 @@
+[gMASK]
+{%- if tools -%}
+<|system|>
+# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within XML tags:
+
+{% for tool in tools %}
+{{ tool | tojson(ensure_ascii=False) }}
+{% endfor %}
+
+
+For each function call, output the function name and arguments within the following XML format:
+{function-name}{arg-key-1}{arg-value-1}{arg-key-2}{arg-value-2}...{%- endif -%}
+{%- macro visible_text(content) -%}
+ {%- if content is string -%}
+ {{- content }}
+ {%- elif content is iterable and content is not mapping -%}
+ {%- for item in content -%}
+ {%- if item is mapping and item.type == 'text' -%}
+ {{- item.text }}
+ {%- elif item is string -%}
+ {{- item }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{- content }}
+ {%- endif -%}
+{%- endmacro -%}
+{%- set ns = namespace(last_user_index=-1) %}
+{%- for m in messages %}
+ {%- if m.role == 'user' %}
+ {% set ns.last_user_index = loop.index0 -%}
+ {%- endif %}
+{%- endfor %}
+{% for m in messages %}
+{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}
+{%- elif m.role == 'assistant' -%}
+<|assistant|>
+{%- set reasoning_content = '' %}
+{%- set content = visible_text(m.content) %}
+{%- if m.reasoning_content is string %}
+ {%- set reasoning_content = m.reasoning_content %}
+{%- else %}
+ {%- if '' in content %}
+ {%- set reasoning_content = content.split('')[0].rstrip('\n').split('')[-1].lstrip('\n') %}
+ {%- set content = content.split('')[-1].lstrip('\n') %}
+ {%- endif %}
+{%- endif %}
+{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content -%}
+{{ '' + reasoning_content.strip() + ''}}
+{%- else -%}
+{{ '' }}
+{%- endif -%}
+{%- if content.strip() -%}
+{{ content.strip() }}
+{%- endif -%}
+{% if m.tool_calls %}
+{% for tc in m.tool_calls %}
+{%- if tc.function %}
+ {%- set tc = tc.function %}
+{%- endif %}
+{{- '' + tc.name -}}
+{% set _args = tc.arguments %}{% for k, v in _args.items() %}{{ k }}{{ v | tojson(ensure_ascii=False) if v is not string else v }}{% endfor %}{% endfor %}
+{% endif %}
+{%- elif m.role == 'tool' -%}
+{%- if m.content is string -%}
+{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|observation|>' }}
+{%- endif %}
+{{- '' }}
+{{- m.content }}
+{{- '' }}
+{%- else -%}
+<|observation|>{% for tr in m.content %}
+{{ tr.output if tr.output is defined else tr }}{% endfor -%}
+{% endif -%}
+{%- elif m.role == 'system' -%}
+<|system|>{{ visible_text(m.content) }}
+{%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+ <|assistant|>{{- '' if (enable_thinking is defined and not enable_thinking) else '' -}}
+{%- endif -%}
\ No newline at end of file
diff --git a/config.json b/config.json
new file mode 100644
index 0000000000000000000000000000000000000000..43cfac7c427e8b6e11c275fe7fb3f67388809026
--- /dev/null
+++ b/config.json
@@ -0,0 +1,45 @@
+{
+ "architectures": [
+ "Glm4MoeLiteForCausalLM"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "pad_token_id": 154820,
+ "eos_token_id": [
+ 154820,
+ 154827,
+ 154829
+ ],
+ "hidden_act": "silu",
+ "hidden_size": 2048,
+ "intermediate_size": 10240,
+ "max_position_embeddings": 202752,
+ "model_type": "glm4_moe_lite",
+ "moe_intermediate_size": 1536,
+ "topk_method": "noaux_tc",
+ "norm_topk_prob": true,
+ "num_attention_heads": 20,
+ "n_group": 1,
+ "topk_group": 1,
+ "n_routed_experts": 64,
+ "n_shared_experts": 1,
+ "routed_scaling_factor": 1.8,
+ "num_experts_per_tok": 4,
+ "first_k_dense_replace": 1,
+ "num_hidden_layers": 47,
+ "num_key_value_heads": 20,
+ "num_nextn_predict_layers": 1,
+ "partial_rotary_factor": 1.0,
+ "rms_norm_eps": 1e-05,
+ "rope_scaling": null,
+ "rope_theta": 1000000,
+ "tie_word_embeddings": false,
+ "torch_dtype": "bfloat16",
+ "transformers_version": "5.0.0rc0",
+ "q_lora_rank": 768,
+ "kv_lora_rank": 512,
+ "qk_nope_head_dim": 192,
+ "qk_rope_head_dim": 64,
+ "v_head_dim": 256,
+ "vocab_size": 154880
+}
\ No newline at end of file
diff --git a/generation_config.json b/generation_config.json
new file mode 100644
index 0000000000000000000000000000000000000000..1dfa4cbf88115f19b4c6fcb086cc5669d2ce5e57
--- /dev/null
+++ b/generation_config.json
@@ -0,0 +1,11 @@
+{
+ "_from_model_config": true,
+ "eos_token_id": [
+ 154820,
+ 154827,
+ 154829
+ ],
+ "pad_token_id": 154820,
+ "temperature": 1.0,
+ "transformers_version": "5.0.0.dev0"
+}
diff --git a/model-00001-of-00048.safetensors b/model-00001-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..79555e094b687b0c0a756194a914b9cfba4fb845
--- /dev/null
+++ b/model-00001-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:90abe0d075755853145c96906a1300f57c167fcc9aa67221239b448abf54933c
+size 1438134344
diff --git a/model-00002-of-00048.safetensors b/model-00002-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..997031ad328ea332bafdcc1ed60aac02048262a7
--- /dev/null
+++ b/model-00002-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c51e2434efe609cbe652014a924e088a5ea97be35ca29cfa893a1a9a90304b1
+size 1270648128
diff --git a/model-00003-of-00048.safetensors b/model-00003-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b40174d31d358efd9127a5b31c4a4b00ad1f803b
--- /dev/null
+++ b/model-00003-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ab6ebdd01af10c9bde1ec0b2a182d6f7ff8df831add6810695e5f3b50918040d
+size 1270648128
diff --git a/model-00004-of-00048.safetensors b/model-00004-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..1b0a6619cf5fc5a32776a4716f83daddc609376b
--- /dev/null
+++ b/model-00004-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:932e4dbb8f7583d2b6392a6e850dd7aa05636827e8b90ff7858cae709838dca7
+size 1270648128
diff --git a/model-00005-of-00048.safetensors b/model-00005-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..2da92dc95b670cf71495dd3cb16a6ecf523ec917
--- /dev/null
+++ b/model-00005-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7f56a3e52bf5905c2df07ef53575ae8e62622138a15b0affc2b3f84785ca81e9
+size 1270648128
diff --git a/model-00006-of-00048.safetensors b/model-00006-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6997f1fb888c80f04d317e02dfbb114c48d74e0e
--- /dev/null
+++ b/model-00006-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83aa0eeca6e5a9394bb8a1c9cab46cc603799d729cf64ed4bbc341e0a7085feb
+size 1270648128
diff --git a/model-00007-of-00048.safetensors b/model-00007-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..12e328dbc944dbfba15d7306817bf7f48bedc43e
--- /dev/null
+++ b/model-00007-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a9679076d3bd10c5f1287d12418fddb41c8b54b8c8ccc0672796ef7e8431b797
+size 1270648128
diff --git a/model-00008-of-00048.safetensors b/model-00008-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..58e42d5d15e86d7f4c8639e8964b505617bae692
--- /dev/null
+++ b/model-00008-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d1231577a057ff957d480c277e497910acbba75fc9232f8fd8f44c1a647d7a2
+size 1270648128
diff --git a/model-00009-of-00048.safetensors b/model-00009-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..fdcac5f154114b539d8be24032d93963a2c41d63
--- /dev/null
+++ b/model-00009-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:095d626626773284535ba4114c2b320003ba3c8430bd104f5e7bba3772b2a1b1
+size 1270648128
diff --git a/model-00010-of-00048.safetensors b/model-00010-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..f1d2b7d3960a92e165d4c4dc1e458b774f5d71fe
--- /dev/null
+++ b/model-00010-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c3f8d197b0d253082f547d67989b1e3928a742b508f6f8d4abf119f3ba2990ed
+size 1270648128
diff --git a/model-00011-of-00048.safetensors b/model-00011-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b8ec965155a3249502752883b7a4a7694c7a308d
--- /dev/null
+++ b/model-00011-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a8b7eeef625a6ed9ced479e064870f0ebf0da67056cd5e582300d0ebcf44de21
+size 1270648328
diff --git a/model-00012-of-00048.safetensors b/model-00012-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..eb8413120307cacac10adee29fb1a9636e9274f0
--- /dev/null
+++ b/model-00012-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49adb8b9f02d1643ab22406b0e05938ce3d7a8986fd0981e0af5d4cd47ee8b12
+size 1270648328
diff --git a/model-00013-of-00048.safetensors b/model-00013-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..17e8b0917cb7dda545724eba28e17b50f4fbe3bc
--- /dev/null
+++ b/model-00013-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f269382ae6ce42cd05937fcd5551e08da151c810a87bb0e1994ff9bb5f9eb7c9
+size 1270648328
diff --git a/model-00014-of-00048.safetensors b/model-00014-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ab153463bacfdcb5a838eb983f0da983c9ece15a
--- /dev/null
+++ b/model-00014-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:acd6f36aadceda658a5db6fc6a04160ae2ca1ae78d4a9c84aeabcea8e80b44c8
+size 1270648328
diff --git a/model-00015-of-00048.safetensors b/model-00015-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..2411b85d38cad605c45d391beccb8b1aaee7c018
--- /dev/null
+++ b/model-00015-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:606079fe1e000d658142c59fee66aa8d7c949d758612debd78d21cde2fae392c
+size 1270648328
diff --git a/model-00016-of-00048.safetensors b/model-00016-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..68962301a992f65ed0a1e9c9e1bf4ab54d7a95cf
--- /dev/null
+++ b/model-00016-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b53fb45f7689b505fc185180503ea9d45af67042ba38914e53bbdee97173c974
+size 1270648328
diff --git a/model-00017-of-00048.safetensors b/model-00017-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..01d6586877e539d465c460cd7d296c785b73e6db
--- /dev/null
+++ b/model-00017-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a51e478badd50bb9b5299e3ad7979bfd40da8ab96983de0cd16a0fb5fddb4bc9
+size 1270648328
diff --git a/model-00018-of-00048.safetensors b/model-00018-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..68dc7197418765aba3b493ee4a281e1f0913bde6
--- /dev/null
+++ b/model-00018-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:680295c83f29a90c97f2d68c1290a152330c2dbd3654abdf0f074e6d8018062c
+size 1270648328
diff --git a/model-00019-of-00048.safetensors b/model-00019-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..db04efb5a3c29dd02bad27cc8415e14c0ef00fb2
--- /dev/null
+++ b/model-00019-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:355f9437e291b0846b5ecac1d644766ec17bd4863da441e4e786f33d7bac3347
+size 1270648328
diff --git a/model-00020-of-00048.safetensors b/model-00020-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..baf9ea29efde5e248ec4c91786716c7590345a5e
--- /dev/null
+++ b/model-00020-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8f8d4698fc5ce834021254b407c61e5376a693dcafeb996a8fc708639b44aed8
+size 1270648328
diff --git a/model-00021-of-00048.safetensors b/model-00021-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..335e00c78b3cce10bfecc2f3232e345cba372c79
--- /dev/null
+++ b/model-00021-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:111570fc7f55bcb9beb7bcb969475b27ffbcad97b95788464ff91a1e07ff897a
+size 1270648328
diff --git a/model-00022-of-00048.safetensors b/model-00022-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..07a061e0e0934142427191ad333c4faf730e2b75
--- /dev/null
+++ b/model-00022-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ced6a73df393df3bd559fc69093763fb5cc9fedee31e245a52a7786939199896
+size 1270648328
diff --git a/model-00023-of-00048.safetensors b/model-00023-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..58373c6654392c0421fd30fdd8634382f2274c07
--- /dev/null
+++ b/model-00023-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b287bb0ce12a1ac975978c94548f2bbab3267e83e0c888871a3ada3ddf91405d
+size 1270648328
diff --git a/model-00024-of-00048.safetensors b/model-00024-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c589464afa8126aa1ec30956b336b15a3a6114f5
--- /dev/null
+++ b/model-00024-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6e77476f0f57dc0386b64741d9251632d46e646faab6780d10a5a6a5f456668
+size 1270648328
diff --git a/model-00025-of-00048.safetensors b/model-00025-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..90e8b33614347252ba7b173d756e2f5d13728783
--- /dev/null
+++ b/model-00025-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8f9e643d9d3e9152b9156424615c8054cffe6b8bc1b776567da17bbd3b97ebcb
+size 1270648328
diff --git a/model-00026-of-00048.safetensors b/model-00026-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c3a69ccd49e63f282bf44eb16afb22242cd89512
--- /dev/null
+++ b/model-00026-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:634da561277b9b77410a319bd930e940d443055fb688a2e4b9574aae1569192d
+size 1270648328
diff --git a/model-00027-of-00048.safetensors b/model-00027-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..003975279dd3a8b514b3d16053cb38d33cb0776d
--- /dev/null
+++ b/model-00027-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3286537b5a5dd2b4742878476601a4f5b1f061e948386babe05076fd058fa5ec
+size 1270648328
diff --git a/model-00028-of-00048.safetensors b/model-00028-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..7f7c1f3479dc27ce604078be87e564c00587fde4
--- /dev/null
+++ b/model-00028-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:56335c906d44c1c6f37530d65182cd6af4fb146fa8a68e8e510d6419b5b007e7
+size 1270648328
diff --git a/model-00029-of-00048.safetensors b/model-00029-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d28749af56eb5a710f183e00474cff20a35dde20
--- /dev/null
+++ b/model-00029-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3e6591dfb36a59918f91c152ba0baf184b4372c667fe46f74024e5d8ae86f378
+size 1270648328
diff --git a/model-00030-of-00048.safetensors b/model-00030-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..4e8b0750abdab36aa3e0253755186c6c73126898
--- /dev/null
+++ b/model-00030-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62373b800980841e6ca5c4c48db570729246092d03ac756bdeb55ecfdd599a2c
+size 1270648328
diff --git a/model-00031-of-00048.safetensors b/model-00031-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..613b2e764dce2ad3203a225df42ad8b1d6932988
--- /dev/null
+++ b/model-00031-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:332d82cefb0b4d87c48e3e989c97d3e74b6f64326313694ed226a72bf26e3b99
+size 1270648328
diff --git a/model-00032-of-00048.safetensors b/model-00032-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6ae2bca6bb87f5d388c8ce055dd591aacae6d23e
--- /dev/null
+++ b/model-00032-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0f9897915eef55f41b221ff072d8b0b401e447007d648b63ddabf9b795361c18
+size 1270648328
diff --git a/model-00033-of-00048.safetensors b/model-00033-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b508d7e86e41dbc3703c34dd95145665bbdb5122
--- /dev/null
+++ b/model-00033-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29059fb79e6fefaad49b7c2833370cd78079b403cc98bc28ce95976207840f7f
+size 1270648328
diff --git a/model-00034-of-00048.safetensors b/model-00034-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..423244000d188e4711e43f4d3ab25749185a352d
--- /dev/null
+++ b/model-00034-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f989cc8af19949a310599c5ab8d65f0ae8406908e1a702588b892604179061a1
+size 1270648328
diff --git a/model-00035-of-00048.safetensors b/model-00035-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..1bbcac89a658ea7146894bfa0ca8bacf02c4f78a
--- /dev/null
+++ b/model-00035-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aaa91e9b1a075d1753e45a01db198ed48f2739392e46e1843a4dcaaab6cd02eb
+size 1270648328
diff --git a/model-00036-of-00048.safetensors b/model-00036-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..80f58c4558afd475957bdd8a0f2f016f81988932
--- /dev/null
+++ b/model-00036-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf807da31533424a54196cd07da6caa1a22199b72d822bd8c2f387172b6e1c65
+size 1270648328
diff --git a/model-00037-of-00048.safetensors b/model-00037-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..5e476c6267dab00b72896cacba5d99a237b5df0a
--- /dev/null
+++ b/model-00037-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:887aeb5568ef6189a7618309e50f0277ad339da86133526ca6d820b88ecbb5d4
+size 1270648328
diff --git a/model-00038-of-00048.safetensors b/model-00038-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..cc7804c6900ab97ce51d7cd545204e4e4eb5d0a6
--- /dev/null
+++ b/model-00038-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:036a38cc10858a8cc5a2fcd09fdab1b53258d04ff87dc4d3d21f140c54b029dd
+size 1270648328
diff --git a/model-00039-of-00048.safetensors b/model-00039-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..24b4684c06c30707bfe4dbc93d7e452fa39ed46c
--- /dev/null
+++ b/model-00039-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62ae341291fe720719ab6cdcf2348752f262f4575a3f8221b353deb224bcbd1d
+size 1270648328
diff --git a/model-00040-of-00048.safetensors b/model-00040-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..f12ecb438675ef82539b00d5cf9e3cb774e2d3bd
--- /dev/null
+++ b/model-00040-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd8af33462a3f7b4dbc5393da4016acffabafc20b826f952e912adb2a2845fe3
+size 1270648328
diff --git a/model-00041-of-00048.safetensors b/model-00041-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..2182f0a8f533f4a69d8ad77aea87fff3ebc0f335
--- /dev/null
+++ b/model-00041-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:39098c9b6770ff5c185a70b8f0d6cfca22f95e3c47dc38de88c3d2ec2b1cf5bc
+size 1270648328
diff --git a/model-00042-of-00048.safetensors b/model-00042-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..379550a2efc703f67b9d19e2325b7c7b164154d4
--- /dev/null
+++ b/model-00042-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ffe0b23aab19b608ca59f46628064c3d53d846c81d6f978c2579b38e09e35747
+size 1270648328
diff --git a/model-00043-of-00048.safetensors b/model-00043-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..976c35fdae9c1a22274f0f3005d01ad444ed4ba6
--- /dev/null
+++ b/model-00043-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:efe1da68f8c30d7e77a68d48eb7d65d28709f536062dba8168490370cca47ba4
+size 1270648328
diff --git a/model-00044-of-00048.safetensors b/model-00044-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b5b4ff5cff42a2bdde049511d5c84cc23c371363
--- /dev/null
+++ b/model-00044-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a731baa1370864ad61cf8fe8000cb8e1e266529206824a3ce96b5229eb3c53c8
+size 1270648328
diff --git a/model-00045-of-00048.safetensors b/model-00045-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..a898e8fd107eabeb2517ec1a7e192216903b534c
--- /dev/null
+++ b/model-00045-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a636f872a931396d2fdc57b6c715294e14b3c8040420ffce856a2a0a18194916
+size 1270648328
diff --git a/model-00046-of-00048.safetensors b/model-00046-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..801fa3f61579792d210ca77eb289b46ba63179d3
--- /dev/null
+++ b/model-00046-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba5365e31c27cd13e330a7135f7251f3bf16e99c8504acffa9e8190b585eb773
+size 1270648328
diff --git a/model-00047-of-00048.safetensors b/model-00047-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8786bf839075106343818c3dcb69ebc6cb3af933
--- /dev/null
+++ b/model-00047-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1bcc5d06065d2a564894657945ccfe9411762421c2c60acf91de31050cd4d84d
+size 2539429936
diff --git a/model-00048-of-00048.safetensors b/model-00048-of-00048.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..66dc5fc40d32acc196b9686568f96f35dc95a747
--- /dev/null
+++ b/model-00048-of-00048.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:35fff90a30ca808d86dc24f9e3eda119832ab69fb1f88ae4cccfbf0e5ee409a1
+size 1287438264
diff --git a/model.safetensors.index.json b/model.safetensors.index.json
new file mode 100644
index 0000000000000000000000000000000000000000..215064e6cfa2a56b41b5ef618d82b802a103cdfc
--- /dev/null
+++ b/model.safetensors.index.json
@@ -0,0 +1,9710 @@
+{
+ "metadata": {
+ "total_size": 31221488576
+ },
+ "weight_map": {
+ "model.embed_tokens.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.input_layernorm.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.mlp.up_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.kv_a_layernorm.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.kv_a_proj_with_mqa.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.kv_b_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.q_a_layernorm.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.q_a_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.0.self_attn.q_b_proj.weight": "model-00001-of-00048.safetensors",
+ "model.layers.47.embed_tokens.weight": "model-00001-of-00048.safetensors",
+ "model.layers.1.input_layernorm.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.0.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.0.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.0.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.1.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.1.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.1.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.10.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.10.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.10.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.11.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.11.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.11.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.12.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.12.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.12.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.13.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.13.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.13.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.14.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.14.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.14.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.15.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.15.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.15.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.16.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.16.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.16.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.17.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.17.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.17.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.18.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.18.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.18.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.19.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.19.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.19.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.2.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.2.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.2.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.20.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.20.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.20.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.21.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.21.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.21.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.22.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.22.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.22.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.23.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.23.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.23.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.24.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.24.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.24.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.25.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.25.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.25.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.26.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.26.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.26.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.27.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.27.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.27.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.28.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.28.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.28.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.29.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.29.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.29.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.3.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.3.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.3.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.30.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.30.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.30.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.31.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.31.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.31.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.32.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.32.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.32.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.33.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.33.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.33.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.34.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.34.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.34.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.35.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.35.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.35.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.36.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.36.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.36.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.37.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.37.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.37.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.38.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.38.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.38.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.39.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.39.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.39.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.4.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.4.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.4.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.40.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.40.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.40.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.41.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.41.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.41.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.42.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.42.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.42.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.43.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.43.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.43.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.44.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.44.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.44.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.45.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.45.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.45.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.46.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.46.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.46.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.47.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.47.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.47.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.48.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.48.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.48.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.49.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.49.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.49.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.5.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.5.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.5.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.50.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.50.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.50.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.51.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.51.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.51.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.52.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.52.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.52.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.53.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.53.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.53.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.54.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.54.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.54.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.55.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.55.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.55.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.56.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.56.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.56.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.57.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.57.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.57.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.58.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.58.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.58.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.59.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.59.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.59.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.6.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.6.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.6.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.60.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.60.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.60.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.61.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.61.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.61.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.62.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.62.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.62.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.63.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.63.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.63.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.7.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.7.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.7.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.8.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.8.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.8.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.9.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.9.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.experts.9.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.gate.e_score_correction_bias": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.gate.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.shared_experts.down_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.shared_experts.gate_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.mlp.shared_experts.up_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.post_attention_layernorm.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.kv_a_layernorm.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.kv_a_proj_with_mqa.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.kv_b_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.o_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.q_a_layernorm.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.q_a_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.1.self_attn.q_b_proj.weight": "model-00002-of-00048.safetensors",
+ "model.layers.2.input_layernorm.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.0.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.0.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.0.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.1.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.1.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.1.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.10.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.10.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.10.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.11.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.11.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.11.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.12.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.12.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.12.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.13.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.13.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.13.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.14.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.14.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.14.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.15.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.15.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.15.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.16.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.16.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.16.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.17.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.17.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.17.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.18.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.18.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.18.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.19.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.19.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.19.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.2.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.2.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.2.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.20.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.20.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.20.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.21.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.21.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.21.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.22.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.22.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.22.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.23.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.23.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.23.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.24.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.24.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.24.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.25.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.25.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.25.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.26.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.26.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.26.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.27.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.27.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.27.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.28.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.28.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.28.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.29.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.29.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.29.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.3.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.3.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.3.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.30.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.30.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.30.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.31.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.31.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.31.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.32.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.32.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.32.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.33.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.33.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.33.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.34.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.34.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.34.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.35.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.35.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.35.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.36.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.36.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.36.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.37.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.37.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.37.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.38.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.38.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.38.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.39.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.39.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.39.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.4.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.4.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.4.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.40.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.40.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.40.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.41.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.41.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.41.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.42.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.42.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.42.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.43.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.43.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.43.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.44.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.44.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.44.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.45.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.45.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.45.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.46.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.46.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.46.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.47.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.47.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.47.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.48.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.48.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.48.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.49.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.49.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.49.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.5.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.5.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.5.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.50.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.50.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.50.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.51.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.51.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.51.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.52.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.52.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.52.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.53.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.53.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.53.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.54.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.54.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.54.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.55.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.55.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.55.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.56.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.56.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.56.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.57.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.57.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.57.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.58.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.58.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.58.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.59.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.59.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.59.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.6.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.6.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.6.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.60.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.60.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.60.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.61.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.61.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.61.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.62.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.62.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.62.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.63.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.63.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.63.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.7.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.7.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.7.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.8.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.8.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.8.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.9.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.9.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.experts.9.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.gate.e_score_correction_bias": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.gate.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.shared_experts.down_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.shared_experts.gate_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.mlp.shared_experts.up_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.post_attention_layernorm.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.kv_a_layernorm.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.kv_a_proj_with_mqa.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.kv_b_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.o_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.q_a_layernorm.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.q_a_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.2.self_attn.q_b_proj.weight": "model-00003-of-00048.safetensors",
+ "model.layers.3.input_layernorm.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.0.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.0.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.0.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.1.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.1.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.1.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.10.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.10.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.10.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.11.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.11.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.11.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.12.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.12.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.12.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.13.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.13.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.13.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.14.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.14.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.14.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.15.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.15.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.15.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.16.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.16.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.16.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.17.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.17.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.17.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.18.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.18.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.18.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.19.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.19.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.19.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.2.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.2.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.2.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.20.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.20.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.20.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.21.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.21.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.21.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.22.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.22.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.22.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.23.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.23.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.23.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.24.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.24.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.24.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.25.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.25.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.25.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.26.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.26.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.26.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.27.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.27.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.27.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.28.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.28.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.28.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.29.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.29.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.29.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.3.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.3.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.3.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.30.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.30.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.30.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.31.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.31.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.31.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.32.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.32.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.32.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.33.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.33.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.33.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.34.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.34.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.34.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.35.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.35.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.35.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.36.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.36.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.36.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.37.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.37.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.37.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.38.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.38.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.38.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.39.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.39.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.39.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.4.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.4.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.4.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.40.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.40.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.40.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.41.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.41.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.41.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.42.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.42.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.42.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.43.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.43.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.43.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.44.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.44.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.44.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.45.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.45.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.45.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.46.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.46.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.46.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.47.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.47.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.47.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.48.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.48.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.48.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.49.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.49.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.49.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.5.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.5.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.5.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.50.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.50.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.50.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.51.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.51.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.51.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.52.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.52.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.52.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.53.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.53.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.53.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.54.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.54.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.54.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.55.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.55.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.55.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.56.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.56.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.56.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.57.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.57.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.57.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.58.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.58.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.58.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.59.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.59.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.59.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.6.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.6.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.6.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.60.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.60.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.60.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.61.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.61.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.61.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.62.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.62.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.62.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.63.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.63.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.63.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.7.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.7.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.7.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.8.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.8.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.8.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.9.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.9.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.experts.9.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.gate.e_score_correction_bias": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.gate.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.shared_experts.down_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.shared_experts.gate_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.mlp.shared_experts.up_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.post_attention_layernorm.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.kv_a_layernorm.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.kv_a_proj_with_mqa.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.kv_b_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.o_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.q_a_layernorm.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.q_a_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.3.self_attn.q_b_proj.weight": "model-00004-of-00048.safetensors",
+ "model.layers.4.input_layernorm.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.0.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.0.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.0.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.1.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.1.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.1.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.10.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.10.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.10.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.11.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.11.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.11.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.12.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.12.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.12.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.13.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.13.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.13.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.14.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.14.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.14.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.15.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.15.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.15.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.16.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.16.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.16.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.17.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.17.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.17.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.18.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.18.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.18.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.19.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.19.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.19.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.2.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.2.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.2.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.20.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.20.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.20.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.21.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.21.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.21.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.22.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.22.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.22.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.23.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.23.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.23.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.24.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.24.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.24.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.25.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.25.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.25.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.26.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.26.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.26.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.27.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.27.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.27.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.28.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.28.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.28.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.29.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.29.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.29.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.3.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.3.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.3.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.30.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.30.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.30.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.31.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.31.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.31.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.32.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.32.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.32.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.33.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.33.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.33.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.34.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.34.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.34.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.35.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.35.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.35.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.36.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.36.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.36.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.37.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.37.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.37.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.38.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.38.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.38.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.39.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.39.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.39.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.4.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.4.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.4.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.40.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.40.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.40.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.41.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.41.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.41.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.42.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.42.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.42.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.43.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.43.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.43.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.44.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.44.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.44.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.45.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.45.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.45.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.46.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.46.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.46.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.47.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.47.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.47.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.48.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.48.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.48.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.49.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.49.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.49.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.5.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.5.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.5.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.50.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.50.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.50.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.51.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.51.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.51.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.52.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.52.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.52.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.53.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.53.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.53.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.54.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.54.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.54.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.55.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.55.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.55.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.56.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.56.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.56.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.57.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.57.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.57.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.58.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.58.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.58.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.59.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.59.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.59.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.6.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.6.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.6.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.60.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.60.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.60.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.61.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.61.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.61.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.62.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.62.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.62.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.63.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.63.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.63.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.7.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.7.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.7.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.8.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.8.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.8.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.9.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.9.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.experts.9.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.gate.e_score_correction_bias": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.gate.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.shared_experts.down_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.shared_experts.gate_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.mlp.shared_experts.up_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.post_attention_layernorm.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.kv_a_layernorm.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.kv_a_proj_with_mqa.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.kv_b_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.o_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.q_a_layernorm.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.q_a_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.4.self_attn.q_b_proj.weight": "model-00005-of-00048.safetensors",
+ "model.layers.5.input_layernorm.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.0.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.0.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.0.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.1.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.1.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.1.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.10.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.10.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.10.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.11.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.11.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.11.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.12.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.12.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.12.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.13.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.13.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.13.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.14.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.14.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.14.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.15.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.15.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.15.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.16.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.16.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.16.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.17.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.17.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.17.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.18.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.18.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.18.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.19.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.19.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.19.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.2.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.2.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.2.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.20.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.20.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.20.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.21.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.21.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.21.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.22.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.22.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.22.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.23.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.23.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.23.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.24.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.24.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.24.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.25.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.25.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.25.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.26.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.26.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.26.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.27.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.27.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.27.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.28.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.28.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.28.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.29.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.29.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.29.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.3.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.3.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.3.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.30.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.30.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.30.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.31.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.31.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.31.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.32.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.32.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.32.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.33.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.33.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.33.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.34.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.34.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.34.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.35.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.35.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.35.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.36.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.36.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.36.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.37.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.37.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.37.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.38.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.38.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.38.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.39.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.39.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.39.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.4.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.4.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.4.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.40.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.40.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.40.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.41.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.41.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.41.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.42.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.42.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.42.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.43.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.43.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.43.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.44.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.44.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.44.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.45.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.45.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.45.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.46.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.46.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.46.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.47.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.47.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.47.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.48.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.48.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.48.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.49.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.49.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.49.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.5.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.5.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.5.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.50.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.50.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.50.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.51.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.51.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.51.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.52.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.52.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.52.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.53.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.53.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.53.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.54.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.54.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.54.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.55.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.55.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.55.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.56.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.56.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.56.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.57.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.57.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.57.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.58.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.58.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.58.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.59.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.59.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.59.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.6.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.6.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.6.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.60.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.60.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.60.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.61.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.61.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.61.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.62.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.62.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.62.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.63.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.63.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.63.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.7.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.7.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.7.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.8.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.8.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.8.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.9.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.9.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.experts.9.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.gate.e_score_correction_bias": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.gate.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.shared_experts.down_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.shared_experts.gate_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.mlp.shared_experts.up_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.post_attention_layernorm.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.kv_a_layernorm.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.kv_a_proj_with_mqa.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.kv_b_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.o_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.q_a_layernorm.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.q_a_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.5.self_attn.q_b_proj.weight": "model-00006-of-00048.safetensors",
+ "model.layers.6.input_layernorm.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.0.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.0.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.0.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.1.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.1.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.1.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.10.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.10.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.10.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.11.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.11.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.11.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.12.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.12.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.12.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.13.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.13.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.13.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.14.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.14.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.14.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.15.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.15.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.15.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.16.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.16.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.16.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.17.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.17.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.17.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.18.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.18.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.18.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.19.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.19.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.19.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.2.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.2.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.2.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.20.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.20.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.20.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.21.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.21.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.21.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.22.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.22.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.22.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.23.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.23.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.23.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.24.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.24.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.24.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.25.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.25.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.25.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.26.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.26.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.26.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.27.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.27.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.27.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.28.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.28.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.28.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.29.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.29.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.29.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.3.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.3.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.3.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.30.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.30.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.30.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.31.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.31.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.31.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.32.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.32.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.32.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.33.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.33.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.33.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.34.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.34.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.34.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.35.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.35.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.35.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.36.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.36.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.36.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.37.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.37.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.37.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.38.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.38.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.38.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.39.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.39.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.39.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.4.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.4.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.4.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.40.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.40.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.40.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.41.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.41.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.41.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.42.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.42.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.42.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.43.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.43.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.43.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.44.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.44.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.44.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.45.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.45.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.45.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.46.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.46.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.46.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.47.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.47.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.47.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.48.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.48.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.48.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.49.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.49.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.49.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.5.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.5.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.5.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.50.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.50.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.50.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.51.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.51.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.51.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.52.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.52.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.52.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.53.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.53.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.53.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.54.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.54.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.54.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.55.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.55.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.55.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.56.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.56.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.56.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.57.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.57.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.57.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.58.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.58.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.58.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.59.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.59.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.59.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.6.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.6.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.6.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.60.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.60.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.60.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.61.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.61.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.61.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.62.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.62.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.62.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.63.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.63.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.63.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.7.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.7.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.7.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.8.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.8.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.8.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.9.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.9.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.experts.9.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.gate.e_score_correction_bias": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.gate.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.shared_experts.down_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.shared_experts.gate_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.mlp.shared_experts.up_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.post_attention_layernorm.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.kv_a_layernorm.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.kv_a_proj_with_mqa.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.kv_b_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.o_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.q_a_layernorm.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.q_a_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.6.self_attn.q_b_proj.weight": "model-00007-of-00048.safetensors",
+ "model.layers.7.input_layernorm.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.0.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.0.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.0.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.1.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.1.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.1.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.10.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.10.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.10.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.11.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.11.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.11.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.12.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.12.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.12.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.13.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.13.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.13.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.14.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.14.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.14.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.15.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.15.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.15.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.16.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.16.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.16.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.17.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.17.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.17.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.18.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.18.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.18.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.19.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.19.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.19.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.2.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.2.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.2.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.20.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.20.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.20.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.21.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.21.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.21.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.22.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.22.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.22.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.23.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.23.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.23.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.24.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.24.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.24.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.25.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.25.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.25.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.26.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.26.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.26.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.27.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.27.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.27.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.28.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.28.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.28.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.29.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.29.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.29.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.3.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.3.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.3.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.30.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.30.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.30.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.31.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.31.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.31.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.32.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.32.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.32.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.33.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.33.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.33.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.34.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.34.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.34.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.35.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.35.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.35.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.36.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.36.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.36.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.37.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.37.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.37.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.38.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.38.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.38.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.39.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.39.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.39.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.4.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.4.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.4.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.40.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.40.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.40.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.41.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.41.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.41.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.42.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.42.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.42.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.43.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.43.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.43.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.44.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.44.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.44.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.45.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.45.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.45.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.46.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.46.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.46.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.47.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.47.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.47.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.48.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.48.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.48.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.49.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.49.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.49.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.5.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.5.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.5.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.50.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.50.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.50.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.51.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.51.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.51.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.52.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.52.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.52.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.53.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.53.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.53.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.54.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.54.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.54.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.55.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.55.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.55.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.56.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.56.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.56.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.57.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.57.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.57.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.58.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.58.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.58.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.59.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.59.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.59.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.6.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.6.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.6.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.60.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.60.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.60.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.61.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.61.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.61.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.62.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.62.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.62.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.63.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.63.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.63.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.7.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.7.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.7.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.8.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.8.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.8.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.9.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.9.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.experts.9.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.gate.e_score_correction_bias": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.gate.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.shared_experts.down_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.shared_experts.gate_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.mlp.shared_experts.up_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.post_attention_layernorm.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.kv_a_layernorm.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.kv_a_proj_with_mqa.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.kv_b_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.o_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.q_a_layernorm.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.q_a_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.7.self_attn.q_b_proj.weight": "model-00008-of-00048.safetensors",
+ "model.layers.8.input_layernorm.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.0.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.0.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.0.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.1.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.1.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.1.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.10.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.10.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.10.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.11.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.11.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.11.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.12.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.12.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.12.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.13.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.13.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.13.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.14.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.14.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.14.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.15.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.15.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.15.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.16.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.16.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.16.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.17.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.17.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.17.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.18.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.18.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.18.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.19.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.19.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.19.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.2.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.2.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.2.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.20.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.20.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.20.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.21.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.21.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.21.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.22.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.22.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.22.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.23.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.23.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.23.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.24.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.24.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.24.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.25.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.25.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.25.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.26.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.26.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.26.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.27.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.27.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.27.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.28.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.28.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.28.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.29.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.29.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.29.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.3.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.3.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.3.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.30.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.30.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.30.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.31.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.31.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.31.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.32.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.32.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.32.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.33.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.33.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.33.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.34.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.34.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.34.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.35.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.35.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.35.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.36.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.36.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.36.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.37.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.37.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.37.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.38.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.38.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.38.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.39.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.39.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.39.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.4.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.4.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.4.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.40.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.40.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.40.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.41.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.41.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.41.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.42.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.42.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.42.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.43.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.43.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.43.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.44.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.44.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.44.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.45.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.45.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.45.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.46.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.46.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.46.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.47.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.47.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.47.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.48.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.48.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.48.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.49.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.49.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.49.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.5.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.5.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.5.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.50.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.50.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.50.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.51.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.51.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.51.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.52.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.52.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.52.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.53.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.53.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.53.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.54.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.54.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.54.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.55.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.55.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.55.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.56.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.56.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.56.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.57.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.57.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.57.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.58.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.58.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.58.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.59.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.59.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.59.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.6.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.6.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.6.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.60.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.60.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.60.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.61.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.61.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.61.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.62.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.62.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.62.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.63.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.63.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.63.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.7.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.7.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.7.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.8.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.8.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.8.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.9.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.9.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.experts.9.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.gate.e_score_correction_bias": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.gate.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.shared_experts.down_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.shared_experts.gate_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.mlp.shared_experts.up_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.post_attention_layernorm.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.kv_a_layernorm.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.kv_a_proj_with_mqa.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.kv_b_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.o_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.q_a_layernorm.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.q_a_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.8.self_attn.q_b_proj.weight": "model-00009-of-00048.safetensors",
+ "model.layers.9.input_layernorm.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.0.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.0.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.0.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.1.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.1.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.1.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.10.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.10.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.10.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.11.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.11.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.11.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.12.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.12.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.12.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.13.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.13.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.13.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.14.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.14.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.14.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.15.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.15.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.15.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.16.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.16.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.16.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.17.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.17.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.17.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.18.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.18.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.18.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.19.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.19.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.19.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.2.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.2.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.2.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.20.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.20.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.20.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.21.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.21.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.21.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.22.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.22.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.22.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.23.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.23.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.23.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.24.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.24.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.24.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.25.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.25.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.25.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.26.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.26.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.26.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.27.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.27.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.27.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.28.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.28.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.28.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.29.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.29.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.29.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.3.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.3.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.3.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.30.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.30.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.30.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.31.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.31.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.31.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.32.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.32.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.32.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.33.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.33.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.33.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.34.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.34.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.34.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.35.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.35.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.35.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.36.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.36.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.36.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.37.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.37.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.37.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.38.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.38.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.38.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.39.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.39.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.39.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.4.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.4.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.4.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.40.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.40.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.40.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.41.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.41.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.41.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.42.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.42.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.42.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.43.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.43.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.43.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.44.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.44.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.44.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.45.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.45.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.45.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.46.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.46.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.46.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.47.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.47.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.47.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.48.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.48.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.48.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.49.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.49.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.49.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.5.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.5.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.5.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.50.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.50.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.50.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.51.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.51.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.51.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.52.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.52.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.52.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.53.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.53.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.53.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.54.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.54.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.54.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.55.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.55.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.55.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.56.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.56.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.56.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.57.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.57.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.57.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.58.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.58.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.58.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.59.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.59.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.59.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.6.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.6.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.6.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.60.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.60.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.60.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.61.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.61.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.61.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.62.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.62.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.62.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.63.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.63.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.63.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.7.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.7.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.7.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.8.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.8.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.8.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.9.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.9.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.experts.9.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.gate.e_score_correction_bias": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.gate.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.shared_experts.down_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.shared_experts.gate_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.mlp.shared_experts.up_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.post_attention_layernorm.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.kv_a_layernorm.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.kv_a_proj_with_mqa.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.kv_b_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.o_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.q_a_layernorm.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.q_a_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.9.self_attn.q_b_proj.weight": "model-00010-of-00048.safetensors",
+ "model.layers.10.input_layernorm.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.0.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.0.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.0.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.1.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.1.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.1.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.10.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.10.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.10.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.11.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.11.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.11.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.12.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.12.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.12.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.13.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.13.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.13.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.14.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.14.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.14.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.15.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.15.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.15.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.16.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.16.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.16.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.17.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.17.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.17.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.18.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.18.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.18.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.19.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.19.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.19.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.2.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.2.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.2.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.20.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.20.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.20.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.21.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.21.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.21.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.22.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.22.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.22.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.23.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.23.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.23.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.24.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.24.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.24.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.25.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.25.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.25.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.26.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.26.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.26.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.27.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.27.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.27.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.28.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.28.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.28.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.29.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.29.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.29.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.3.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.3.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.3.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.30.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.30.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.30.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.31.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.31.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.31.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.32.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.32.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.32.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.33.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.33.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.33.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.34.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.34.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.34.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.35.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.35.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.35.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.36.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.36.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.36.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.37.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.37.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.37.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.38.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.38.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.38.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.39.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.39.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.39.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.4.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.4.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.4.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.40.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.40.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.40.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.41.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.41.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.41.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.42.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.42.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.42.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.43.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.43.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.43.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.44.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.44.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.44.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.45.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.45.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.45.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.46.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.46.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.46.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.47.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.47.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.47.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.48.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.48.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.48.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.49.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.49.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.49.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.5.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.5.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.5.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.50.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.50.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.50.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.51.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.51.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.51.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.52.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.52.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.52.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.53.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.53.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.53.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.54.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.54.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.54.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.55.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.55.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.55.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.56.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.56.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.56.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.57.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.57.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.57.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.58.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.58.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.58.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.59.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.59.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.59.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.6.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.6.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.6.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.60.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.60.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.60.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.61.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.61.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.61.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.62.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.62.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.62.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.63.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.63.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.63.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.7.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.7.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.7.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.8.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.8.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.8.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.9.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.9.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.experts.9.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.gate.e_score_correction_bias": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.gate.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.shared_experts.down_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.shared_experts.gate_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.mlp.shared_experts.up_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.post_attention_layernorm.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.kv_a_layernorm.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.kv_a_proj_with_mqa.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.kv_b_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.o_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.q_a_layernorm.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.q_a_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.10.self_attn.q_b_proj.weight": "model-00011-of-00048.safetensors",
+ "model.layers.11.input_layernorm.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.0.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.0.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.0.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.1.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.1.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.1.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.10.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.10.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.10.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.11.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.11.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.11.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.12.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.12.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.12.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.13.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.13.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.13.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.14.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.14.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.14.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.15.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.15.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.15.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.16.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.16.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.16.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.17.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.17.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.17.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.18.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.18.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.18.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.19.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.19.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.19.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.2.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.2.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.2.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.20.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.20.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.20.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.21.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.21.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.21.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.22.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.22.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.22.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.23.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.23.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.23.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.24.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.24.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.24.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.25.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.25.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.25.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.26.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.26.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.26.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.27.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.27.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.27.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.28.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.28.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.28.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.29.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.29.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.29.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.3.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.3.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.3.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.30.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.30.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.30.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.31.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.31.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.31.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.32.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.32.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.32.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.33.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.33.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.33.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.34.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.34.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.34.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.35.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.35.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.35.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.36.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.36.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.36.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.37.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.37.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.37.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.38.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.38.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.38.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.39.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.39.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.39.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.4.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.4.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.4.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.40.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.40.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.40.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.41.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.41.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.41.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.42.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.42.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.42.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.43.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.43.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.43.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.44.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.44.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.44.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.45.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.45.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.45.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.46.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.46.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.46.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.47.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.47.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.47.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.48.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.48.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.48.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.49.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.49.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.49.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.5.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.5.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.5.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.50.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.50.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.50.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.51.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.51.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.51.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.52.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.52.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.52.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.53.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.53.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.53.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.54.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.54.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.54.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.55.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.55.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.55.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.56.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.56.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.56.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.57.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.57.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.57.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.58.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.58.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.58.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.59.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.59.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.59.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.6.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.6.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.6.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.60.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.60.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.60.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.61.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.61.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.61.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.62.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.62.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.62.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.63.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.63.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.63.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.7.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.7.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.7.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.8.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.8.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.8.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.9.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.9.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.experts.9.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.gate.e_score_correction_bias": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.gate.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.shared_experts.down_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.shared_experts.gate_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.mlp.shared_experts.up_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.post_attention_layernorm.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.kv_a_layernorm.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.kv_a_proj_with_mqa.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.kv_b_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.o_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.q_a_layernorm.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.q_a_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.11.self_attn.q_b_proj.weight": "model-00012-of-00048.safetensors",
+ "model.layers.12.input_layernorm.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.0.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.0.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.0.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.1.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.1.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.1.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.10.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.10.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.10.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.11.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.11.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.11.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.12.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.12.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.12.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.13.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.13.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.13.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.14.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.14.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.14.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.15.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.15.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.15.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.16.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.16.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.16.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.17.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.17.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.17.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.18.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.18.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.18.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.19.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.19.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.19.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.2.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.2.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.2.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.20.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.20.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.20.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.21.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.21.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.21.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.22.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.22.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.22.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.23.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.23.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.23.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.24.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.24.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.24.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.25.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.25.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.25.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.26.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.26.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.26.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.27.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.27.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.27.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.28.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.28.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.28.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.29.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.29.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.29.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.3.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.3.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.3.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.30.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.30.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.30.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.31.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.31.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.31.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.32.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.32.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.32.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.33.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.33.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.33.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.34.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.34.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.34.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.35.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.35.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.35.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.36.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.36.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.36.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.37.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.37.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.37.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.38.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.38.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.38.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.39.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.39.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.39.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.4.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.4.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.4.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.40.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.40.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.40.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.41.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.41.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.41.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.42.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.42.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.42.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.43.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.43.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.43.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.44.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.44.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.44.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.45.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.45.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.45.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.46.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.46.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.46.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.47.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.47.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.47.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.48.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.48.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.48.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.49.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.49.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.49.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.5.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.5.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.5.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.50.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.50.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.50.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.51.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.51.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.51.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.52.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.52.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.52.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.53.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.53.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.53.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.54.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.54.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.54.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.55.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.55.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.55.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.56.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.56.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.56.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.57.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.57.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.57.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.58.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.58.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.58.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.59.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.59.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.59.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.6.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.6.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.6.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.60.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.60.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.60.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.61.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.61.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.61.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.62.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.62.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.62.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.63.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.63.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.63.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.7.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.7.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.7.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.8.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.8.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.8.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.9.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.9.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.experts.9.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.gate.e_score_correction_bias": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.gate.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.shared_experts.down_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.shared_experts.gate_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.mlp.shared_experts.up_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.post_attention_layernorm.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.kv_a_layernorm.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.kv_a_proj_with_mqa.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.kv_b_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.o_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.q_a_layernorm.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.q_a_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.12.self_attn.q_b_proj.weight": "model-00013-of-00048.safetensors",
+ "model.layers.13.input_layernorm.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.0.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.0.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.0.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.1.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.1.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.1.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.10.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.10.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.10.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.11.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.11.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.11.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.12.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.12.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.12.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.13.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.13.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.13.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.14.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.14.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.14.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.15.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.15.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.15.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.16.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.16.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.16.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.17.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.17.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.17.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.18.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.18.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.18.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.19.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.19.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.19.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.2.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.2.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.2.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.20.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.20.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.20.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.21.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.21.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.21.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.22.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.22.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.22.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.23.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.23.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.23.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.24.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.24.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.24.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.25.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.25.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.25.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.26.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.26.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.26.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.27.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.27.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.27.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.28.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.28.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.28.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.29.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.29.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.29.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.3.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.3.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.3.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.30.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.30.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.30.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.31.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.31.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.31.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.32.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.32.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.32.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.33.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.33.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.33.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.34.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.34.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.34.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.35.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.35.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.35.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.36.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.36.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.36.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.37.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.37.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.37.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.38.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.38.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.38.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.39.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.39.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.39.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.4.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.4.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.4.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.40.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.40.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.40.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.41.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.41.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.41.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.42.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.42.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.42.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.43.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.43.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.43.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.44.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.44.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.44.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.45.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.45.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.45.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.46.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.46.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.46.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.47.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.47.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.47.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.48.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.48.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.48.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.49.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.49.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.49.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.5.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.5.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.5.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.50.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.50.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.50.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.51.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.51.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.51.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.52.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.52.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.52.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.53.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.53.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.53.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.54.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.54.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.54.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.55.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.55.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.55.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.56.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.56.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.56.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.57.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.57.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.57.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.58.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.58.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.58.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.59.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.59.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.59.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.6.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.6.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.6.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.60.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.60.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.60.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.61.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.61.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.61.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.62.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.62.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.62.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.63.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.63.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.63.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.7.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.7.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.7.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.8.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.8.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.8.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.9.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.9.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.experts.9.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.gate.e_score_correction_bias": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.gate.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.shared_experts.down_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.shared_experts.gate_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.mlp.shared_experts.up_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.post_attention_layernorm.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.kv_a_layernorm.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.kv_a_proj_with_mqa.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.kv_b_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.o_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.q_a_layernorm.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.q_a_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.13.self_attn.q_b_proj.weight": "model-00014-of-00048.safetensors",
+ "model.layers.14.input_layernorm.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.0.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.0.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.0.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.1.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.1.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.1.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.10.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.10.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.10.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.11.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.11.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.11.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.12.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.12.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.12.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.13.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.13.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.13.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.14.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.14.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.14.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.15.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.15.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.15.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.16.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.16.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.16.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.17.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.17.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.17.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.18.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.18.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.18.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.19.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.19.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.19.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.2.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.2.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.2.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.20.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.20.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.20.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.21.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.21.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.21.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.22.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.22.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.22.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.23.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.23.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.23.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.24.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.24.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.24.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.25.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.25.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.25.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.26.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.26.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.26.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.27.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.27.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.27.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.28.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.28.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.28.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.29.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.29.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.29.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.3.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.3.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.3.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.30.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.30.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.30.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.31.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.31.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.31.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.32.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.32.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.32.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.33.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.33.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.33.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.34.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.34.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.34.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.35.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.35.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.35.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.36.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.36.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.36.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.37.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.37.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.37.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.38.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.38.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.38.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.39.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.39.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.39.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.4.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.4.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.4.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.40.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.40.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.40.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.41.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.41.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.41.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.42.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.42.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.42.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.43.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.43.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.43.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.44.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.44.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.44.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.45.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.45.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.45.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.46.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.46.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.46.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.47.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.47.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.47.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.48.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.48.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.48.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.49.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.49.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.49.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.5.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.5.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.5.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.50.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.50.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.50.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.51.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.51.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.51.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.52.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.52.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.52.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.53.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.53.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.53.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.54.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.54.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.54.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.55.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.55.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.55.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.56.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.56.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.56.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.57.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.57.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.57.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.58.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.58.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.58.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.59.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.59.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.59.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.6.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.6.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.6.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.60.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.60.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.60.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.61.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.61.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.61.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.62.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.62.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.62.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.63.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.63.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.63.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.7.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.7.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.7.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.8.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.8.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.8.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.9.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.9.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.experts.9.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.gate.e_score_correction_bias": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.gate.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.shared_experts.down_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.shared_experts.gate_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.mlp.shared_experts.up_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.post_attention_layernorm.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.kv_a_layernorm.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.kv_a_proj_with_mqa.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.kv_b_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.o_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.q_a_layernorm.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.q_a_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.14.self_attn.q_b_proj.weight": "model-00015-of-00048.safetensors",
+ "model.layers.15.input_layernorm.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.0.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.0.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.0.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.1.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.1.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.1.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.10.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.10.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.10.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.11.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.11.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.11.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.12.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.12.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.12.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.13.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.13.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.13.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.14.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.14.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.14.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.15.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.15.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.15.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.16.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.16.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.16.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.17.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.17.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.17.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.18.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.18.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.18.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.19.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.19.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.19.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.2.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.2.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.2.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.20.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.20.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.20.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.21.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.21.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.21.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.22.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.22.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.22.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.23.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.23.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.23.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.24.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.24.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.24.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.25.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.25.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.25.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.26.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.26.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.26.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.27.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.27.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.27.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.28.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.28.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.28.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.29.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.29.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.29.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.3.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.3.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.3.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.30.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.30.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.30.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.31.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.31.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.31.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.32.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.32.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.32.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.33.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.33.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.33.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.34.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.34.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.34.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.35.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.35.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.35.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.36.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.36.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.36.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.37.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.37.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.37.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.38.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.38.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.38.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.39.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.39.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.39.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.4.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.4.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.4.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.40.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.40.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.40.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.41.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.41.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.41.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.42.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.42.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.42.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.43.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.43.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.43.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.44.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.44.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.44.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.45.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.45.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.45.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.46.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.46.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.46.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.47.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.47.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.47.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.48.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.48.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.48.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.49.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.49.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.49.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.5.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.5.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.5.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.50.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.50.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.50.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.51.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.51.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.51.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.52.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.52.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.52.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.53.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.53.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.53.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.54.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.54.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.54.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.55.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.55.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.55.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.56.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.56.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.56.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.57.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.57.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.57.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.58.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.58.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.58.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.59.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.59.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.59.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.6.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.6.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.6.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.60.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.60.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.60.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.61.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.61.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.61.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.62.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.62.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.62.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.63.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.63.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.63.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.7.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.7.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.7.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.8.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.8.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.8.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.9.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.9.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.experts.9.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.gate.e_score_correction_bias": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.gate.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.shared_experts.down_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.shared_experts.gate_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.mlp.shared_experts.up_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.post_attention_layernorm.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.kv_a_layernorm.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.kv_a_proj_with_mqa.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.kv_b_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.o_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.q_a_layernorm.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.q_a_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.15.self_attn.q_b_proj.weight": "model-00016-of-00048.safetensors",
+ "model.layers.16.input_layernorm.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.0.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.0.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.0.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.1.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.1.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.1.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.10.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.10.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.10.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.11.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.11.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.11.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.12.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.12.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.12.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.13.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.13.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.13.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.14.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.14.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.14.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.15.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.15.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.15.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.16.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.16.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.16.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.17.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.17.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.17.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.18.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.18.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.18.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.19.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.19.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.19.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.2.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.2.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.2.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.20.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.20.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.20.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.21.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.21.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.21.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.22.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.22.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.22.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.23.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.23.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.23.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.24.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.24.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.24.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.25.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.25.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.25.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.26.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.26.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.26.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.27.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.27.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.27.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.28.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.28.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.28.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.29.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.29.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.29.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.3.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.3.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.3.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.30.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.30.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.30.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.31.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.31.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.31.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.32.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.32.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.32.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.33.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.33.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.33.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.34.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.34.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.34.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.35.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.35.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.35.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.36.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.36.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.36.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.37.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.37.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.37.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.38.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.38.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.38.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.39.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.39.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.39.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.4.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.4.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.4.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.40.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.40.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.40.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.41.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.41.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.41.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.42.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.42.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.42.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.43.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.43.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.43.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.44.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.44.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.44.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.45.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.45.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.45.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.46.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.46.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.46.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.47.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.47.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.47.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.48.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.48.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.48.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.49.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.49.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.49.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.5.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.5.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.5.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.50.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.50.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.50.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.51.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.51.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.51.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.52.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.52.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.52.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.53.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.53.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.53.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.54.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.54.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.54.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.55.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.55.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.55.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.56.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.56.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.56.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.57.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.57.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.57.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.58.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.58.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.58.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.59.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.59.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.59.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.6.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.6.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.6.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.60.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.60.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.60.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.61.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.61.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.61.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.62.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.62.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.62.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.63.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.63.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.63.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.7.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.7.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.7.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.8.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.8.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.8.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.9.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.9.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.experts.9.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.gate.e_score_correction_bias": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.gate.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.shared_experts.down_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.shared_experts.gate_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.mlp.shared_experts.up_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.post_attention_layernorm.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.kv_a_layernorm.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.kv_a_proj_with_mqa.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.kv_b_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.o_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.q_a_layernorm.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.q_a_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.16.self_attn.q_b_proj.weight": "model-00017-of-00048.safetensors",
+ "model.layers.17.input_layernorm.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.0.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.0.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.0.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.1.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.1.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.1.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.10.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.10.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.10.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.11.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.11.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.11.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.12.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.12.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.12.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.13.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.13.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.13.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.14.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.14.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.14.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.15.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.15.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.15.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.16.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.16.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.16.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.17.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.17.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.17.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.18.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.18.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.18.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.19.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.19.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.19.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.2.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.2.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.2.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.20.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.20.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.20.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.21.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.21.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.21.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.22.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.22.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.22.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.23.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.23.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.23.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.24.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.24.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.24.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.25.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.25.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.25.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.26.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.26.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.26.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.27.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.27.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.27.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.28.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.28.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.28.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.29.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.29.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.29.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.3.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.3.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.3.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.30.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.30.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.30.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.31.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.31.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.31.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.32.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.32.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.32.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.33.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.33.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.33.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.34.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.34.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.34.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.35.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.35.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.35.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.36.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.36.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.36.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.37.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.37.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.37.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.38.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.38.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.38.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.39.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.39.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.39.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.4.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.4.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.4.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.40.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.40.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.40.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.41.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.41.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.41.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.42.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.42.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.42.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.43.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.43.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.43.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.44.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.44.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.44.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.45.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.45.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.45.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.46.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.46.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.46.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.47.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.47.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.47.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.48.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.48.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.48.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.49.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.49.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.49.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.5.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.5.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.5.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.50.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.50.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.50.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.51.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.51.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.51.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.52.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.52.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.52.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.53.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.53.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.53.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.54.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.54.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.54.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.55.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.55.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.55.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.56.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.56.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.56.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.57.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.57.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.57.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.58.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.58.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.58.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.59.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.59.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.59.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.6.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.6.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.6.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.60.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.60.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.60.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.61.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.61.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.61.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.62.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.62.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.62.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.63.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.63.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.63.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.7.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.7.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.7.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.8.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.8.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.8.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.9.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.9.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.experts.9.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.gate.e_score_correction_bias": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.gate.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.shared_experts.down_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.shared_experts.gate_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.mlp.shared_experts.up_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.post_attention_layernorm.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.kv_a_layernorm.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.kv_a_proj_with_mqa.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.kv_b_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.o_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.q_a_layernorm.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.q_a_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.17.self_attn.q_b_proj.weight": "model-00018-of-00048.safetensors",
+ "model.layers.18.input_layernorm.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.0.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.0.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.0.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.1.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.1.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.1.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.10.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.10.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.10.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.11.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.11.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.11.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.12.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.12.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.12.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.13.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.13.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.13.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.14.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.14.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.14.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.15.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.15.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.15.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.16.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.16.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.16.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.17.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.17.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.17.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.18.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.18.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.18.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.19.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.19.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.19.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.2.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.2.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.2.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.20.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.20.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.20.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.21.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.21.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.21.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.22.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.22.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.22.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.23.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.23.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.23.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.24.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.24.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.24.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.25.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.25.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.25.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.26.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.26.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.26.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.27.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.27.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.27.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.28.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.28.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.28.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.29.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.29.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.29.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.3.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.3.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.3.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.30.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.30.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.30.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.31.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.31.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.31.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.32.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.32.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.32.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.33.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.33.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.33.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.34.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.34.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.34.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.35.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.35.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.35.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.36.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.36.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.36.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.37.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.37.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.37.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.38.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.38.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.38.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.39.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.39.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.39.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.4.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.4.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.4.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.40.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.40.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.40.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.41.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.41.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.41.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.42.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.42.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.42.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.43.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.43.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.43.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.44.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.44.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.44.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.45.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.45.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.45.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.46.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.46.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.46.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.47.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.47.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.47.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.48.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.48.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.48.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.49.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.49.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.49.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.5.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.5.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.5.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.50.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.50.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.50.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.51.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.51.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.51.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.52.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.52.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.52.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.53.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.53.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.53.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.54.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.54.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.54.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.55.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.55.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.55.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.56.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.56.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.56.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.57.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.57.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.57.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.58.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.58.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.58.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.59.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.59.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.59.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.6.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.6.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.6.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.60.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.60.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.60.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.61.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.61.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.61.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.62.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.62.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.62.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.63.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.63.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.63.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.7.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.7.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.7.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.8.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.8.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.8.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.9.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.9.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.experts.9.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.gate.e_score_correction_bias": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.gate.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.shared_experts.down_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.shared_experts.gate_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.mlp.shared_experts.up_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.post_attention_layernorm.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.kv_a_layernorm.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.kv_a_proj_with_mqa.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.kv_b_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.o_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.q_a_layernorm.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.q_a_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.18.self_attn.q_b_proj.weight": "model-00019-of-00048.safetensors",
+ "model.layers.19.input_layernorm.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.0.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.0.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.0.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.1.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.1.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.1.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.10.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.10.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.10.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.11.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.11.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.11.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.12.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.12.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.12.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.13.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.13.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.13.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.14.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.14.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.14.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.15.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.15.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.15.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.16.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.16.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.16.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.17.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.17.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.17.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.18.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.18.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.18.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.19.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.19.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.19.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.2.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.2.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.2.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.20.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.20.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.20.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.21.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.21.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.21.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.22.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.22.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.22.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.23.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.23.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.23.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.24.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.24.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.24.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.25.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.25.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.25.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.26.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.26.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.26.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.27.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.27.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.27.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.28.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.28.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.28.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.29.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.29.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.29.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.3.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.3.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.3.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.30.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.30.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.30.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.31.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.31.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.31.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.32.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.32.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.32.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.33.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.33.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.33.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.34.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.34.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.34.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.35.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.35.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.35.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.36.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.36.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.36.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.37.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.37.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.37.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.38.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.38.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.38.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.39.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.39.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.39.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.4.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.4.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.4.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.40.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.40.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.40.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.41.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.41.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.41.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.42.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.42.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.42.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.43.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.43.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.43.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.44.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.44.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.44.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.45.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.45.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.45.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.46.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.46.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.46.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.47.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.47.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.47.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.48.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.48.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.48.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.49.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.49.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.49.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.5.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.5.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.5.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.50.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.50.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.50.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.51.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.51.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.51.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.52.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.52.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.52.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.53.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.53.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.53.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.54.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.54.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.54.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.55.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.55.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.55.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.56.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.56.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.56.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.57.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.57.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.57.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.58.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.58.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.58.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.59.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.59.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.59.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.6.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.6.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.6.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.60.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.60.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.60.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.61.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.61.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.61.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.62.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.62.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.62.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.63.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.63.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.63.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.7.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.7.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.7.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.8.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.8.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.8.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.9.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.9.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.experts.9.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.gate.e_score_correction_bias": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.gate.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.shared_experts.down_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.shared_experts.gate_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.mlp.shared_experts.up_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.post_attention_layernorm.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.kv_a_layernorm.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.kv_a_proj_with_mqa.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.kv_b_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.o_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.q_a_layernorm.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.q_a_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.19.self_attn.q_b_proj.weight": "model-00020-of-00048.safetensors",
+ "model.layers.20.input_layernorm.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.0.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.0.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.0.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.1.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.1.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.1.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.10.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.10.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.10.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.11.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.11.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.11.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.12.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.12.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.12.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.13.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.13.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.13.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.14.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.14.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.14.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.15.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.15.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.15.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.16.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.16.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.16.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.17.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.17.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.17.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.18.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.18.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.18.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.19.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.19.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.19.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.2.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.2.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.2.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.20.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.20.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.20.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.21.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.21.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.21.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.22.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.22.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.22.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.23.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.23.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.23.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.24.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.24.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.24.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.25.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.25.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.25.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.26.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.26.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.26.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.27.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.27.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.27.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.28.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.28.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.28.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.29.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.29.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.29.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.3.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.3.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.3.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.30.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.30.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.30.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.31.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.31.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.31.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.32.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.32.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.32.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.33.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.33.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.33.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.34.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.34.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.34.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.35.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.35.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.35.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.36.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.36.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.36.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.37.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.37.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.37.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.38.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.38.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.38.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.39.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.39.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.39.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.4.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.4.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.4.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.40.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.40.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.40.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.41.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.41.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.41.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.42.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.42.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.42.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.43.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.43.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.43.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.44.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.44.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.44.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.45.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.45.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.45.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.46.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.46.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.46.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.47.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.47.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.47.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.48.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.48.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.48.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.49.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.49.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.49.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.5.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.5.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.5.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.50.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.50.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.50.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.51.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.51.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.51.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.52.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.52.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.52.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.53.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.53.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.53.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.54.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.54.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.54.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.55.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.55.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.55.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.56.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.56.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.56.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.57.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.57.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.57.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.58.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.58.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.58.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.59.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.59.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.59.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.6.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.6.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.6.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.60.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.60.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.60.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.61.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.61.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.61.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.62.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.62.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.62.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.63.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.63.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.63.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.7.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.7.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.7.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.8.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.8.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.8.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.9.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.9.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.experts.9.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.gate.e_score_correction_bias": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.gate.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.shared_experts.down_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.shared_experts.gate_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.mlp.shared_experts.up_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.post_attention_layernorm.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.kv_a_layernorm.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.kv_a_proj_with_mqa.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.kv_b_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.o_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.q_a_layernorm.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.q_a_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.20.self_attn.q_b_proj.weight": "model-00021-of-00048.safetensors",
+ "model.layers.21.input_layernorm.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.0.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.0.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.0.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.1.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.1.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.1.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.10.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.10.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.10.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.11.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.11.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.11.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.12.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.12.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.12.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.13.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.13.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.13.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.14.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.14.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.14.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.15.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.15.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.15.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.16.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.16.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.16.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.17.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.17.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.17.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.18.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.18.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.18.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.19.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.19.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.19.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.2.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.2.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.2.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.20.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.20.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.20.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.21.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.21.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.21.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.22.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.22.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.22.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.23.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.23.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.23.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.24.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.24.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.24.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.25.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.25.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.25.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.26.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.26.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.26.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.27.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.27.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.27.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.28.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.28.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.28.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.29.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.29.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.29.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.3.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.3.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.3.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.30.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.30.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.30.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.31.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.31.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.31.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.32.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.32.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.32.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.33.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.33.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.33.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.34.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.34.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.34.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.35.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.35.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.35.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.36.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.36.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.36.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.37.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.37.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.37.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.38.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.38.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.38.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.39.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.39.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.39.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.4.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.4.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.4.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.40.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.40.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.40.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.41.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.41.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.41.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.42.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.42.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.42.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.43.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.43.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.43.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.44.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.44.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.44.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.45.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.45.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.45.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.46.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.46.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.46.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.47.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.47.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.47.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.48.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.48.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.48.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.49.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.49.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.49.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.5.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.5.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.5.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.50.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.50.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.50.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.51.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.51.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.51.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.52.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.52.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.52.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.53.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.53.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.53.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.54.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.54.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.54.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.55.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.55.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.55.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.56.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.56.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.56.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.57.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.57.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.57.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.58.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.58.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.58.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.59.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.59.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.59.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.6.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.6.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.6.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.60.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.60.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.60.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.61.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.61.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.61.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.62.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.62.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.62.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.63.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.63.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.63.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.7.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.7.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.7.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.8.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.8.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.8.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.9.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.9.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.experts.9.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.gate.e_score_correction_bias": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.gate.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.shared_experts.down_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.shared_experts.gate_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.mlp.shared_experts.up_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.post_attention_layernorm.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.kv_a_layernorm.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.kv_a_proj_with_mqa.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.kv_b_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.o_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.q_a_layernorm.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.q_a_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.21.self_attn.q_b_proj.weight": "model-00022-of-00048.safetensors",
+ "model.layers.22.input_layernorm.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.0.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.0.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.0.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.1.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.1.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.1.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.10.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.10.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.10.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.11.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.11.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.11.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.12.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.12.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.12.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.13.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.13.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.13.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.14.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.14.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.14.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.15.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.15.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.15.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.16.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.16.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.16.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.17.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.17.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.17.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.18.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.18.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.18.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.19.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.19.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.19.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.2.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.2.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.2.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.20.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.20.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.20.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.21.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.21.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.21.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.22.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.22.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.22.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.23.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.23.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.23.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.24.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.24.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.24.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.25.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.25.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.25.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.26.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.26.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.26.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.27.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.27.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.27.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.28.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.28.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.28.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.29.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.29.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.29.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.3.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.3.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.3.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.30.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.30.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.30.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.31.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.31.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.31.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.32.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.32.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.32.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.33.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.33.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.33.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.34.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.34.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.34.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.35.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.35.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.35.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.36.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.36.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.36.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.37.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.37.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.37.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.38.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.38.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.38.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.39.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.39.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.39.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.4.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.4.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.4.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.40.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.40.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.40.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.41.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.41.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.41.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.42.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.42.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.42.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.43.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.43.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.43.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.44.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.44.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.44.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.45.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.45.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.45.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.46.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.46.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.46.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.47.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.47.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.47.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.48.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.48.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.48.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.49.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.49.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.49.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.5.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.5.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.5.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.50.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.50.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.50.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.51.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.51.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.51.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.52.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.52.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.52.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.53.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.53.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.53.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.54.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.54.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.54.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.55.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.55.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.55.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.56.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.56.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.56.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.57.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.57.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.57.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.58.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.58.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.58.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.59.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.59.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.59.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.6.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.6.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.6.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.60.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.60.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.60.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.61.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.61.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.61.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.62.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.62.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.62.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.63.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.63.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.63.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.7.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.7.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.7.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.8.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.8.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.8.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.9.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.9.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.experts.9.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.gate.e_score_correction_bias": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.gate.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.shared_experts.down_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.shared_experts.gate_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.mlp.shared_experts.up_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.post_attention_layernorm.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.kv_a_layernorm.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.kv_a_proj_with_mqa.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.kv_b_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.o_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.q_a_layernorm.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.q_a_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.22.self_attn.q_b_proj.weight": "model-00023-of-00048.safetensors",
+ "model.layers.23.input_layernorm.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.0.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.0.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.0.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.1.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.1.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.1.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.10.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.10.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.10.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.11.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.11.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.11.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.12.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.12.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.12.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.13.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.13.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.13.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.14.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.14.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.14.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.15.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.15.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.15.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.16.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.16.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.16.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.17.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.17.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.17.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.18.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.18.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.18.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.19.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.19.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.19.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.2.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.2.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.2.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.20.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.20.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.20.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.21.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.21.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.21.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.22.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.22.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.22.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.23.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.23.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.23.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.24.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.24.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.24.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.25.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.25.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.25.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.26.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.26.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.26.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.27.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.27.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.27.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.28.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.28.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.28.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.29.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.29.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.29.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.3.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.3.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.3.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.30.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.30.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.30.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.31.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.31.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.31.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.32.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.32.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.32.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.33.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.33.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.33.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.34.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.34.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.34.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.35.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.35.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.35.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.36.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.36.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.36.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.37.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.37.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.37.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.38.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.38.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.38.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.39.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.39.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.39.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.4.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.4.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.4.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.40.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.40.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.40.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.41.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.41.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.41.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.42.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.42.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.42.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.43.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.43.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.43.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.44.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.44.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.44.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.45.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.45.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.45.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.46.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.46.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.46.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.47.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.47.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.47.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.48.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.48.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.48.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.49.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.49.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.49.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.5.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.5.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.5.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.50.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.50.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.50.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.51.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.51.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.51.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.52.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.52.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.52.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.53.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.53.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.53.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.54.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.54.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.54.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.55.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.55.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.55.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.56.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.56.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.56.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.57.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.57.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.57.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.58.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.58.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.58.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.59.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.59.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.59.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.6.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.6.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.6.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.60.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.60.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.60.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.61.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.61.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.61.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.62.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.62.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.62.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.63.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.63.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.63.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.7.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.7.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.7.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.8.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.8.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.8.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.9.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.9.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.experts.9.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.gate.e_score_correction_bias": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.gate.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.shared_experts.down_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.shared_experts.gate_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.mlp.shared_experts.up_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.post_attention_layernorm.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.kv_a_layernorm.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.kv_a_proj_with_mqa.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.kv_b_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.o_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.q_a_layernorm.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.q_a_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.23.self_attn.q_b_proj.weight": "model-00024-of-00048.safetensors",
+ "model.layers.24.input_layernorm.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.0.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.0.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.0.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.1.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.1.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.1.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.10.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.10.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.10.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.11.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.11.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.11.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.12.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.12.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.12.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.13.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.13.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.13.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.14.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.14.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.14.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.15.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.15.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.15.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.16.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.16.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.16.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.17.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.17.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.17.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.18.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.18.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.18.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.19.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.19.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.19.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.2.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.2.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.2.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.20.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.20.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.20.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.21.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.21.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.21.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.22.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.22.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.22.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.23.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.23.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.23.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.24.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.24.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.24.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.25.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.25.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.25.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.26.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.26.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.26.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.27.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.27.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.27.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.28.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.28.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.28.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.29.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.29.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.29.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.3.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.3.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.3.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.30.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.30.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.30.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.31.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.31.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.31.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.32.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.32.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.32.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.33.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.33.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.33.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.34.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.34.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.34.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.35.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.35.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.35.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.36.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.36.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.36.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.37.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.37.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.37.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.38.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.38.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.38.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.39.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.39.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.39.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.4.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.4.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.4.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.40.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.40.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.40.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.41.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.41.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.41.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.42.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.42.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.42.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.43.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.43.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.43.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.44.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.44.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.44.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.45.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.45.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.45.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.46.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.46.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.46.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.47.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.47.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.47.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.48.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.48.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.48.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.49.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.49.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.49.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.5.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.5.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.5.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.50.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.50.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.50.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.51.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.51.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.51.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.52.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.52.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.52.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.53.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.53.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.53.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.54.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.54.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.54.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.55.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.55.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.55.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.56.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.56.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.56.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.57.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.57.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.57.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.58.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.58.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.58.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.59.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.59.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.59.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.6.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.6.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.6.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.60.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.60.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.60.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.61.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.61.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.61.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.62.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.62.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.62.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.63.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.63.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.63.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.7.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.7.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.7.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.8.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.8.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.8.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.9.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.9.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.experts.9.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.gate.e_score_correction_bias": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.gate.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.shared_experts.down_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.shared_experts.gate_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.mlp.shared_experts.up_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.post_attention_layernorm.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.kv_a_layernorm.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.kv_a_proj_with_mqa.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.kv_b_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.o_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.q_a_layernorm.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.q_a_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.24.self_attn.q_b_proj.weight": "model-00025-of-00048.safetensors",
+ "model.layers.25.input_layernorm.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.0.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.0.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.0.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.1.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.1.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.1.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.10.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.10.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.10.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.11.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.11.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.11.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.12.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.12.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.12.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.13.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.13.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.13.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.14.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.14.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.14.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.15.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.15.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.15.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.16.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.16.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.16.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.17.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.17.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.17.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.18.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.18.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.18.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.19.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.19.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.19.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.2.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.2.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.2.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.20.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.20.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.20.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.21.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.21.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.21.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.22.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.22.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.22.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.23.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.23.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.23.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.24.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.24.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.24.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.25.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.25.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.25.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.26.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.26.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.26.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.27.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.27.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.27.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.28.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.28.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.28.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.29.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.29.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.29.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.3.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.3.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.3.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.30.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.30.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.30.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.31.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.31.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.31.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.32.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.32.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.32.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.33.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.33.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.33.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.34.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.34.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.34.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.35.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.35.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.35.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.36.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.36.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.36.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.37.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.37.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.37.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.38.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.38.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.38.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.39.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.39.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.39.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.4.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.4.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.4.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.40.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.40.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.40.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.41.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.41.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.41.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.42.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.42.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.42.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.43.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.43.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.43.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.44.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.44.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.44.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.45.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.45.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.45.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.46.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.46.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.46.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.47.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.47.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.47.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.48.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.48.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.48.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.49.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.49.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.49.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.5.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.5.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.5.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.50.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.50.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.50.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.51.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.51.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.51.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.52.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.52.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.52.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.53.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.53.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.53.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.54.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.54.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.54.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.55.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.55.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.55.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.56.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.56.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.56.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.57.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.57.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.57.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.58.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.58.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.58.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.59.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.59.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.59.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.6.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.6.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.6.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.60.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.60.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.60.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.61.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.61.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.61.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.62.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.62.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.62.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.63.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.63.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.63.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.7.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.7.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.7.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.8.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.8.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.8.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.9.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.9.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.experts.9.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.gate.e_score_correction_bias": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.gate.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.shared_experts.down_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.shared_experts.gate_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.mlp.shared_experts.up_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.post_attention_layernorm.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.kv_a_layernorm.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.kv_a_proj_with_mqa.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.kv_b_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.o_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.q_a_layernorm.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.q_a_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.25.self_attn.q_b_proj.weight": "model-00026-of-00048.safetensors",
+ "model.layers.26.input_layernorm.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.0.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.0.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.0.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.1.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.1.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.1.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.10.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.10.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.10.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.11.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.11.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.11.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.12.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.12.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.12.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.13.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.13.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.13.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.14.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.14.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.14.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.15.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.15.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.15.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.16.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.16.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.16.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.17.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.17.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.17.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.18.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.18.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.18.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.19.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.19.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.19.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.2.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.2.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.2.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.20.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.20.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.20.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.21.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.21.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.21.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.22.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.22.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.22.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.23.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.23.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.23.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.24.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.24.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.24.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.25.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.25.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.25.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.26.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.26.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.26.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.27.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.27.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.27.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.28.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.28.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.28.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.29.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.29.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.29.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.3.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.3.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.3.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.30.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.30.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.30.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.31.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.31.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.31.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.32.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.32.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.32.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.33.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.33.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.33.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.34.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.34.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.34.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.35.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.35.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.35.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.36.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.36.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.36.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.37.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.37.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.37.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.38.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.38.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.38.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.39.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.39.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.39.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.4.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.4.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.4.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.40.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.40.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.40.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.41.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.41.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.41.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.42.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.42.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.42.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.43.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.43.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.43.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.44.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.44.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.44.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.45.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.45.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.45.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.46.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.46.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.46.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.47.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.47.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.47.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.48.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.48.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.48.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.49.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.49.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.49.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.5.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.5.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.5.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.50.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.50.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.50.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.51.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.51.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.51.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.52.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.52.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.52.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.53.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.53.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.53.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.54.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.54.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.54.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.55.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.55.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.55.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.56.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.56.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.56.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.57.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.57.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.57.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.58.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.58.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.58.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.59.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.59.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.59.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.6.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.6.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.6.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.60.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.60.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.60.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.61.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.61.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.61.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.62.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.62.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.62.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.63.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.63.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.63.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.7.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.7.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.7.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.8.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.8.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.8.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.9.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.9.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.experts.9.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.gate.e_score_correction_bias": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.gate.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.shared_experts.down_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.shared_experts.gate_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.mlp.shared_experts.up_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.post_attention_layernorm.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.kv_a_layernorm.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.kv_a_proj_with_mqa.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.kv_b_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.o_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.q_a_layernorm.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.q_a_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.26.self_attn.q_b_proj.weight": "model-00027-of-00048.safetensors",
+ "model.layers.27.input_layernorm.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.0.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.0.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.0.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.1.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.1.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.1.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.10.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.10.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.10.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.11.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.11.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.11.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.12.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.12.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.12.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.13.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.13.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.13.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.14.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.14.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.14.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.15.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.15.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.15.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.16.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.16.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.16.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.17.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.17.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.17.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.18.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.18.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.18.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.19.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.19.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.19.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.2.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.2.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.2.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.20.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.20.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.20.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.21.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.21.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.21.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.22.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.22.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.22.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.23.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.23.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.23.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.24.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.24.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.24.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.25.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.25.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.25.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.26.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.26.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.26.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.27.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.27.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.27.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.28.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.28.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.28.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.29.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.29.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.29.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.3.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.3.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.3.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.30.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.30.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.30.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.31.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.31.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.31.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.32.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.32.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.32.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.33.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.33.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.33.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.34.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.34.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.34.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.35.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.35.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.35.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.36.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.36.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.36.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.37.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.37.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.37.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.38.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.38.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.38.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.39.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.39.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.39.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.4.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.4.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.4.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.40.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.40.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.40.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.41.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.41.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.41.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.42.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.42.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.42.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.43.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.43.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.43.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.44.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.44.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.44.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.45.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.45.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.45.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.46.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.46.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.46.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.47.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.47.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.47.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.48.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.48.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.48.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.49.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.49.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.49.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.5.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.5.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.5.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.50.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.50.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.50.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.51.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.51.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.51.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.52.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.52.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.52.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.53.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.53.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.53.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.54.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.54.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.54.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.55.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.55.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.55.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.56.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.56.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.56.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.57.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.57.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.57.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.58.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.58.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.58.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.59.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.59.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.59.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.6.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.6.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.6.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.60.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.60.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.60.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.61.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.61.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.61.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.62.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.62.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.62.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.63.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.63.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.63.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.7.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.7.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.7.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.8.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.8.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.8.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.9.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.9.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.experts.9.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.gate.e_score_correction_bias": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.gate.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.shared_experts.down_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.shared_experts.gate_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.mlp.shared_experts.up_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.post_attention_layernorm.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.kv_a_layernorm.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.kv_a_proj_with_mqa.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.kv_b_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.o_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.q_a_layernorm.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.q_a_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.27.self_attn.q_b_proj.weight": "model-00028-of-00048.safetensors",
+ "model.layers.28.input_layernorm.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.0.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.0.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.0.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.1.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.1.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.1.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.10.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.10.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.10.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.11.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.11.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.11.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.12.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.12.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.12.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.13.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.13.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.13.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.14.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.14.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.14.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.15.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.15.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.15.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.16.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.16.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.16.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.17.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.17.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.17.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.18.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.18.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.18.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.19.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.19.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.19.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.2.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.2.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.2.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.20.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.20.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.20.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.21.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.21.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.21.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.22.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.22.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.22.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.23.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.23.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.23.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.24.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.24.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.24.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.25.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.25.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.25.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.26.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.26.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.26.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.27.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.27.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.27.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.28.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.28.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.28.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.29.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.29.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.29.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.3.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.3.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.3.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.30.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.30.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.30.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.31.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.31.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.31.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.32.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.32.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.32.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.33.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.33.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.33.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.34.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.34.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.34.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.35.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.35.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.35.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.36.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.36.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.36.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.37.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.37.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.37.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.38.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.38.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.38.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.39.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.39.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.39.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.4.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.4.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.4.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.40.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.40.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.40.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.41.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.41.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.41.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.42.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.42.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.42.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.43.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.43.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.43.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.44.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.44.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.44.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.45.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.45.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.45.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.46.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.46.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.46.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.47.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.47.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.47.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.48.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.48.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.48.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.49.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.49.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.49.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.5.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.5.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.5.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.50.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.50.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.50.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.51.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.51.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.51.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.52.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.52.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.52.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.53.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.53.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.53.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.54.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.54.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.54.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.55.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.55.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.55.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.56.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.56.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.56.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.57.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.57.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.57.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.58.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.58.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.58.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.59.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.59.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.59.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.6.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.6.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.6.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.60.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.60.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.60.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.61.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.61.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.61.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.62.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.62.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.62.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.63.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.63.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.63.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.7.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.7.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.7.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.8.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.8.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.8.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.9.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.9.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.experts.9.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.gate.e_score_correction_bias": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.gate.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.shared_experts.down_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.shared_experts.gate_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.mlp.shared_experts.up_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.post_attention_layernorm.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.kv_a_layernorm.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.kv_a_proj_with_mqa.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.kv_b_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.o_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.q_a_layernorm.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.q_a_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.28.self_attn.q_b_proj.weight": "model-00029-of-00048.safetensors",
+ "model.layers.29.input_layernorm.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.0.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.0.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.0.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.1.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.1.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.1.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.10.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.10.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.10.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.11.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.11.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.11.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.12.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.12.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.12.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.13.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.13.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.13.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.14.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.14.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.14.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.15.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.15.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.15.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.16.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.16.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.16.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.17.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.17.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.17.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.18.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.18.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.18.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.19.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.19.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.19.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.2.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.2.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.2.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.20.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.20.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.20.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.21.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.21.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.21.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.22.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.22.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.22.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.23.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.23.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.23.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.24.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.24.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.24.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.25.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.25.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.25.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.26.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.26.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.26.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.27.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.27.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.27.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.28.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.28.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.28.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.29.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.29.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.29.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.3.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.3.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.3.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.30.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.30.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.30.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.31.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.31.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.31.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.32.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.32.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.32.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.33.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.33.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.33.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.34.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.34.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.34.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.35.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.35.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.35.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.36.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.36.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.36.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.37.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.37.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.37.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.38.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.38.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.38.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.39.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.39.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.39.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.4.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.4.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.4.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.40.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.40.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.40.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.41.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.41.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.41.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.42.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.42.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.42.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.43.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.43.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.43.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.44.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.44.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.44.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.45.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.45.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.45.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.46.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.46.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.46.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.47.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.47.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.47.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.48.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.48.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.48.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.49.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.49.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.49.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.5.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.5.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.5.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.50.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.50.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.50.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.51.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.51.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.51.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.52.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.52.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.52.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.53.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.53.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.53.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.54.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.54.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.54.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.55.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.55.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.55.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.56.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.56.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.56.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.57.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.57.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.57.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.58.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.58.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.58.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.59.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.59.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.59.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.6.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.6.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.6.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.60.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.60.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.60.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.61.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.61.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.61.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.62.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.62.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.62.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.63.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.63.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.63.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.7.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.7.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.7.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.8.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.8.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.8.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.9.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.9.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.experts.9.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.gate.e_score_correction_bias": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.gate.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.shared_experts.down_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.shared_experts.gate_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.mlp.shared_experts.up_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.post_attention_layernorm.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.kv_a_layernorm.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.kv_a_proj_with_mqa.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.kv_b_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.o_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.q_a_layernorm.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.q_a_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.29.self_attn.q_b_proj.weight": "model-00030-of-00048.safetensors",
+ "model.layers.30.input_layernorm.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.0.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.0.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.0.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.1.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.1.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.1.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.10.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.10.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.10.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.11.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.11.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.11.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.12.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.12.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.12.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.13.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.13.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.13.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.14.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.14.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.14.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.15.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.15.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.15.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.16.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.16.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.16.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.17.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.17.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.17.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.18.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.18.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.18.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.19.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.19.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.19.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.2.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.2.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.2.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.20.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.20.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.20.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.21.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.21.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.21.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.22.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.22.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.22.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.23.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.23.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.23.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.24.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.24.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.24.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.25.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.25.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.25.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.26.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.26.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.26.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.27.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.27.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.27.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.28.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.28.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.28.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.29.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.29.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.29.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.3.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.3.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.3.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.30.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.30.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.30.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.31.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.31.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.31.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.32.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.32.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.32.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.33.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.33.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.33.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.34.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.34.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.34.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.35.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.35.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.35.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.36.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.36.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.36.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.37.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.37.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.37.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.38.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.38.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.38.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.39.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.39.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.39.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.4.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.4.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.4.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.40.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.40.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.40.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.41.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.41.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.41.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.42.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.42.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.42.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.43.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.43.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.43.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.44.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.44.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.44.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.45.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.45.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.45.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.46.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.46.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.46.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.47.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.47.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.47.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.48.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.48.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.48.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.49.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.49.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.49.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.5.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.5.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.5.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.50.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.50.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.50.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.51.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.51.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.51.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.52.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.52.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.52.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.53.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.53.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.53.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.54.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.54.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.54.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.55.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.55.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.55.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.56.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.56.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.56.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.57.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.57.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.57.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.58.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.58.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.58.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.59.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.59.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.59.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.6.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.6.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.6.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.60.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.60.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.60.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.61.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.61.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.61.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.62.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.62.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.62.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.63.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.63.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.63.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.7.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.7.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.7.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.8.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.8.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.8.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.9.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.9.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.experts.9.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.gate.e_score_correction_bias": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.gate.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.shared_experts.down_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.shared_experts.gate_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.mlp.shared_experts.up_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.post_attention_layernorm.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.kv_a_layernorm.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.kv_a_proj_with_mqa.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.kv_b_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.o_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.q_a_layernorm.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.q_a_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.30.self_attn.q_b_proj.weight": "model-00031-of-00048.safetensors",
+ "model.layers.31.input_layernorm.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.0.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.0.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.0.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.1.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.1.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.1.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.10.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.10.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.10.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.11.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.11.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.11.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.12.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.12.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.12.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.13.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.13.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.13.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.14.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.14.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.14.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.15.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.15.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.15.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.16.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.16.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.16.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.17.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.17.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.17.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.18.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.18.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.18.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.19.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.19.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.19.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.2.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.2.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.2.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.20.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.20.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.20.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.21.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.21.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.21.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.22.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.22.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.22.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.23.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.23.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.23.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.24.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.24.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.24.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.25.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.25.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.25.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.26.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.26.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.26.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.27.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.27.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.27.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.28.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.28.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.28.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.29.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.29.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.29.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.3.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.3.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.3.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.30.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.30.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.30.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.31.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.31.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.31.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.32.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.32.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.32.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.33.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.33.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.33.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.34.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.34.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.34.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.35.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.35.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.35.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.36.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.36.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.36.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.37.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.37.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.37.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.38.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.38.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.38.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.39.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.39.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.39.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.4.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.4.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.4.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.40.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.40.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.40.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.41.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.41.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.41.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.42.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.42.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.42.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.43.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.43.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.43.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.44.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.44.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.44.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.45.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.45.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.45.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.46.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.46.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.46.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.47.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.47.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.47.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.48.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.48.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.48.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.49.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.49.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.49.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.5.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.5.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.5.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.50.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.50.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.50.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.51.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.51.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.51.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.52.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.52.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.52.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.53.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.53.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.53.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.54.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.54.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.54.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.55.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.55.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.55.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.56.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.56.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.56.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.57.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.57.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.57.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.58.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.58.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.58.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.59.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.59.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.59.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.6.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.6.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.6.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.60.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.60.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.60.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.61.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.61.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.61.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.62.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.62.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.62.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.63.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.63.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.63.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.7.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.7.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.7.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.8.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.8.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.8.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.9.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.9.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.experts.9.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.gate.e_score_correction_bias": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.gate.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.shared_experts.down_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.shared_experts.gate_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.mlp.shared_experts.up_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.post_attention_layernorm.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.kv_a_layernorm.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.kv_a_proj_with_mqa.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.kv_b_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.o_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.q_a_layernorm.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.q_a_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.31.self_attn.q_b_proj.weight": "model-00032-of-00048.safetensors",
+ "model.layers.32.input_layernorm.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.0.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.0.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.0.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.1.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.1.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.1.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.10.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.10.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.10.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.11.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.11.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.11.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.12.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.12.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.12.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.13.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.13.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.13.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.14.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.14.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.14.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.15.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.15.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.15.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.16.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.16.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.16.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.17.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.17.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.17.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.18.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.18.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.18.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.19.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.19.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.19.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.2.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.2.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.2.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.20.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.20.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.20.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.21.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.21.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.21.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.22.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.22.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.22.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.23.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.23.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.23.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.24.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.24.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.24.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.25.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.25.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.25.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.26.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.26.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.26.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.27.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.27.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.27.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.28.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.28.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.28.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.29.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.29.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.29.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.3.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.3.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.3.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.30.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.30.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.30.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.31.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.31.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.31.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.32.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.32.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.32.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.33.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.33.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.33.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.34.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.34.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.34.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.35.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.35.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.35.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.36.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.36.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.36.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.37.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.37.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.37.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.38.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.38.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.38.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.39.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.39.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.39.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.4.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.4.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.4.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.40.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.40.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.40.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.41.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.41.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.41.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.42.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.42.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.42.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.43.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.43.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.43.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.44.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.44.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.44.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.45.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.45.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.45.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.46.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.46.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.46.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.47.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.47.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.47.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.48.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.48.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.48.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.49.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.49.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.49.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.5.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.5.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.5.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.50.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.50.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.50.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.51.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.51.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.51.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.52.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.52.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.52.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.53.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.53.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.53.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.54.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.54.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.54.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.55.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.55.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.55.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.56.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.56.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.56.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.57.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.57.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.57.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.58.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.58.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.58.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.59.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.59.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.59.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.6.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.6.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.6.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.60.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.60.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.60.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.61.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.61.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.61.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.62.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.62.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.62.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.63.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.63.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.63.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.7.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.7.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.7.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.8.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.8.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.8.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.9.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.9.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.experts.9.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.gate.e_score_correction_bias": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.gate.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.shared_experts.down_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.shared_experts.gate_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.mlp.shared_experts.up_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.post_attention_layernorm.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.kv_a_layernorm.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.kv_a_proj_with_mqa.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.kv_b_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.o_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.q_a_layernorm.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.q_a_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.32.self_attn.q_b_proj.weight": "model-00033-of-00048.safetensors",
+ "model.layers.33.input_layernorm.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.0.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.0.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.0.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.1.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.1.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.1.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.10.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.10.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.10.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.11.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.11.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.11.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.12.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.12.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.12.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.13.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.13.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.13.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.14.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.14.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.14.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.15.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.15.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.15.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.16.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.16.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.16.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.17.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.17.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.17.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.18.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.18.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.18.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.19.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.19.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.19.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.2.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.2.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.2.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.20.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.20.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.20.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.21.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.21.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.21.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.22.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.22.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.22.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.23.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.23.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.23.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.24.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.24.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.24.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.25.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.25.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.25.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.26.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.26.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.26.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.27.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.27.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.27.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.28.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.28.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.28.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.29.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.29.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.29.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.3.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.3.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.3.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.30.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.30.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.30.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.31.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.31.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.31.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.32.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.32.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.32.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.33.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.33.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.33.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.34.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.34.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.34.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.35.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.35.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.35.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.36.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.36.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.36.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.37.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.37.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.37.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.38.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.38.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.38.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.39.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.39.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.39.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.4.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.4.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.4.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.40.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.40.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.40.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.41.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.41.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.41.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.42.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.42.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.42.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.43.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.43.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.43.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.44.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.44.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.44.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.45.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.45.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.45.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.46.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.46.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.46.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.47.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.47.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.47.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.48.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.48.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.48.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.49.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.49.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.49.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.5.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.5.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.5.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.50.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.50.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.50.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.51.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.51.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.51.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.52.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.52.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.52.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.53.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.53.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.53.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.54.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.54.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.54.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.55.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.55.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.55.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.56.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.56.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.56.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.57.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.57.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.57.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.58.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.58.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.58.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.59.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.59.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.59.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.6.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.6.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.6.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.60.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.60.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.60.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.61.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.61.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.61.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.62.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.62.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.62.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.63.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.63.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.63.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.7.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.7.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.7.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.8.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.8.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.8.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.9.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.9.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.experts.9.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.gate.e_score_correction_bias": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.gate.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.shared_experts.down_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.shared_experts.gate_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.mlp.shared_experts.up_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.post_attention_layernorm.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.kv_a_layernorm.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.kv_a_proj_with_mqa.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.kv_b_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.o_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.q_a_layernorm.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.q_a_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.33.self_attn.q_b_proj.weight": "model-00034-of-00048.safetensors",
+ "model.layers.34.input_layernorm.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.0.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.0.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.0.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.1.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.1.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.1.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.10.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.10.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.10.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.11.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.11.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.11.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.12.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.12.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.12.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.13.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.13.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.13.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.14.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.14.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.14.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.15.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.15.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.15.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.16.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.16.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.16.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.17.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.17.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.17.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.18.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.18.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.18.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.19.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.19.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.19.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.2.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.2.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.2.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.20.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.20.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.20.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.21.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.21.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.21.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.22.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.22.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.22.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.23.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.23.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.23.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.24.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.24.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.24.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.25.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.25.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.25.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.26.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.26.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.26.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.27.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.27.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.27.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.28.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.28.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.28.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.29.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.29.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.29.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.3.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.3.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.3.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.30.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.30.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.30.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.31.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.31.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.31.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.32.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.32.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.32.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.33.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.33.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.33.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.34.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.34.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.34.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.35.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.35.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.35.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.36.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.36.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.36.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.37.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.37.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.37.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.38.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.38.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.38.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.39.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.39.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.39.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.4.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.4.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.4.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.40.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.40.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.40.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.41.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.41.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.41.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.42.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.42.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.42.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.43.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.43.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.43.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.44.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.44.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.44.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.45.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.45.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.45.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.46.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.46.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.46.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.47.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.47.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.47.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.48.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.48.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.48.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.49.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.49.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.49.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.5.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.5.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.5.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.50.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.50.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.50.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.51.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.51.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.51.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.52.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.52.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.52.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.53.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.53.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.53.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.54.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.54.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.54.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.55.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.55.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.55.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.56.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.56.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.56.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.57.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.57.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.57.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.58.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.58.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.58.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.59.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.59.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.59.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.6.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.6.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.6.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.60.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.60.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.60.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.61.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.61.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.61.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.62.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.62.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.62.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.63.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.63.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.63.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.7.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.7.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.7.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.8.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.8.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.8.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.9.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.9.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.experts.9.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.gate.e_score_correction_bias": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.gate.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.shared_experts.down_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.shared_experts.gate_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.mlp.shared_experts.up_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.post_attention_layernorm.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.kv_a_layernorm.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.kv_a_proj_with_mqa.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.kv_b_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.o_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.q_a_layernorm.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.q_a_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.34.self_attn.q_b_proj.weight": "model-00035-of-00048.safetensors",
+ "model.layers.35.input_layernorm.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.0.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.0.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.0.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.1.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.1.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.1.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.10.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.10.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.10.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.11.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.11.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.11.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.12.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.12.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.12.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.13.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.13.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.13.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.14.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.14.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.14.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.15.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.15.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.15.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.16.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.16.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.16.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.17.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.17.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.17.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.18.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.18.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.18.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.19.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.19.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.19.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.2.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.2.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.2.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.20.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.20.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.20.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.21.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.21.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.21.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.22.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.22.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.22.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.23.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.23.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.23.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.24.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.24.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.24.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.25.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.25.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.25.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.26.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.26.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.26.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.27.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.27.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.27.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.28.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.28.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.28.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.29.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.29.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.29.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.3.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.3.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.3.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.30.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.30.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.30.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.31.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.31.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.31.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.32.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.32.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.32.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.33.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.33.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.33.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.34.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.34.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.34.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.35.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.35.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.35.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.36.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.36.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.36.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.37.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.37.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.37.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.38.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.38.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.38.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.39.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.39.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.39.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.4.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.4.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.4.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.40.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.40.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.40.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.41.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.41.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.41.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.42.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.42.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.42.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.43.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.43.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.43.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.44.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.44.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.44.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.45.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.45.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.45.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.46.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.46.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.46.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.47.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.47.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.47.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.48.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.48.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.48.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.49.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.49.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.49.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.5.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.5.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.5.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.50.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.50.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.50.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.51.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.51.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.51.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.52.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.52.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.52.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.53.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.53.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.53.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.54.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.54.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.54.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.55.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.55.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.55.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.56.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.56.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.56.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.57.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.57.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.57.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.58.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.58.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.58.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.59.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.59.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.59.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.6.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.6.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.6.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.60.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.60.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.60.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.61.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.61.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.61.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.62.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.62.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.62.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.63.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.63.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.63.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.7.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.7.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.7.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.8.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.8.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.8.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.9.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.9.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.experts.9.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.gate.e_score_correction_bias": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.gate.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.shared_experts.down_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.shared_experts.gate_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.mlp.shared_experts.up_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.post_attention_layernorm.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.kv_a_layernorm.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.kv_a_proj_with_mqa.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.kv_b_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.o_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.q_a_layernorm.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.q_a_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.35.self_attn.q_b_proj.weight": "model-00036-of-00048.safetensors",
+ "model.layers.36.input_layernorm.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.0.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.0.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.0.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.1.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.1.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.1.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.10.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.10.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.10.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.11.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.11.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.11.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.12.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.12.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.12.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.13.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.13.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.13.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.14.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.14.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.14.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.15.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.15.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.15.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.16.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.16.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.16.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.17.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.17.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.17.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.18.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.18.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.18.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.19.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.19.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.19.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.2.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.2.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.2.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.20.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.20.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.20.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.21.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.21.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.21.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.22.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.22.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.22.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.23.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.23.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.23.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.24.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.24.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.24.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.25.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.25.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.25.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.26.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.26.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.26.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.27.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.27.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.27.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.28.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.28.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.28.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.29.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.29.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.29.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.3.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.3.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.3.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.30.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.30.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.30.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.31.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.31.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.31.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.32.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.32.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.32.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.33.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.33.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.33.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.34.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.34.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.34.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.35.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.35.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.35.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.36.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.36.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.36.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.37.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.37.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.37.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.38.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.38.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.38.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.39.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.39.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.39.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.4.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.4.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.4.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.40.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.40.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.40.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.41.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.41.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.41.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.42.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.42.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.42.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.43.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.43.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.43.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.44.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.44.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.44.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.45.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.45.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.45.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.46.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.46.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.46.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.47.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.47.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.47.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.48.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.48.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.48.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.49.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.49.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.49.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.5.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.5.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.5.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.50.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.50.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.50.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.51.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.51.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.51.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.52.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.52.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.52.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.53.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.53.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.53.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.54.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.54.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.54.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.55.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.55.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.55.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.56.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.56.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.56.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.57.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.57.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.57.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.58.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.58.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.58.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.59.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.59.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.59.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.6.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.6.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.6.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.60.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.60.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.60.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.61.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.61.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.61.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.62.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.62.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.62.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.63.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.63.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.63.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.7.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.7.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.7.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.8.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.8.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.8.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.9.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.9.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.experts.9.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.gate.e_score_correction_bias": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.gate.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.shared_experts.down_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.shared_experts.gate_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.mlp.shared_experts.up_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.post_attention_layernorm.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.kv_a_layernorm.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.kv_a_proj_with_mqa.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.kv_b_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.o_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.q_a_layernorm.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.q_a_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.36.self_attn.q_b_proj.weight": "model-00037-of-00048.safetensors",
+ "model.layers.37.input_layernorm.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.0.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.0.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.0.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.1.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.1.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.1.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.10.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.10.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.10.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.11.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.11.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.11.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.12.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.12.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.12.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.13.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.13.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.13.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.14.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.14.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.14.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.15.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.15.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.15.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.16.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.16.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.16.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.17.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.17.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.17.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.18.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.18.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.18.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.19.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.19.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.19.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.2.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.2.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.2.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.20.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.20.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.20.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.21.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.21.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.21.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.22.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.22.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.22.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.23.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.23.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.23.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.24.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.24.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.24.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.25.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.25.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.25.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.26.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.26.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.26.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.27.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.27.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.27.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.28.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.28.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.28.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.29.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.29.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.29.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.3.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.3.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.3.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.30.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.30.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.30.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.31.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.31.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.31.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.32.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.32.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.32.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.33.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.33.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.33.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.34.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.34.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.34.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.35.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.35.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.35.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.36.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.36.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.36.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.37.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.37.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.37.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.38.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.38.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.38.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.39.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.39.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.39.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.4.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.4.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.4.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.40.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.40.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.40.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.41.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.41.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.41.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.42.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.42.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.42.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.43.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.43.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.43.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.44.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.44.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.44.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.45.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.45.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.45.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.46.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.46.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.46.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.47.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.47.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.47.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.48.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.48.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.48.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.49.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.49.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.49.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.5.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.5.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.5.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.50.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.50.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.50.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.51.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.51.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.51.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.52.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.52.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.52.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.53.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.53.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.53.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.54.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.54.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.54.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.55.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.55.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.55.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.56.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.56.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.56.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.57.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.57.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.57.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.58.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.58.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.58.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.59.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.59.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.59.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.6.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.6.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.6.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.60.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.60.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.60.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.61.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.61.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.61.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.62.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.62.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.62.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.63.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.63.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.63.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.7.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.7.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.7.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.8.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.8.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.8.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.9.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.9.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.experts.9.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.gate.e_score_correction_bias": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.gate.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.shared_experts.down_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.shared_experts.gate_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.mlp.shared_experts.up_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.post_attention_layernorm.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.kv_a_layernorm.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.kv_a_proj_with_mqa.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.kv_b_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.o_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.q_a_layernorm.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.q_a_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.37.self_attn.q_b_proj.weight": "model-00038-of-00048.safetensors",
+ "model.layers.38.input_layernorm.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.0.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.0.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.0.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.1.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.1.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.1.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.10.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.10.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.10.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.11.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.11.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.11.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.12.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.12.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.12.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.13.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.13.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.13.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.14.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.14.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.14.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.15.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.15.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.15.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.16.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.16.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.16.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.17.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.17.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.17.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.18.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.18.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.18.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.19.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.19.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.19.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.2.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.2.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.2.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.20.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.20.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.20.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.21.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.21.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.21.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.22.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.22.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.22.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.23.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.23.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.23.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.24.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.24.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.24.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.25.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.25.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.25.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.26.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.26.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.26.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.27.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.27.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.27.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.28.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.28.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.28.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.29.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.29.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.29.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.3.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.3.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.3.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.30.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.30.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.30.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.31.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.31.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.31.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.32.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.32.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.32.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.33.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.33.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.33.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.34.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.34.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.34.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.35.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.35.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.35.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.36.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.36.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.36.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.37.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.37.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.37.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.38.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.38.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.38.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.39.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.39.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.39.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.4.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.4.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.4.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.40.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.40.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.40.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.41.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.41.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.41.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.42.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.42.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.42.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.43.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.43.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.43.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.44.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.44.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.44.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.45.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.45.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.45.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.46.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.46.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.46.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.47.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.47.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.47.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.48.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.48.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.48.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.49.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.49.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.49.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.5.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.5.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.5.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.50.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.50.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.50.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.51.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.51.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.51.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.52.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.52.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.52.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.53.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.53.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.53.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.54.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.54.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.54.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.55.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.55.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.55.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.56.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.56.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.56.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.57.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.57.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.57.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.58.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.58.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.58.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.59.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.59.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.59.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.6.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.6.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.6.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.60.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.60.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.60.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.61.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.61.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.61.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.62.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.62.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.62.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.63.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.63.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.63.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.7.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.7.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.7.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.8.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.8.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.8.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.9.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.9.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.experts.9.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.gate.e_score_correction_bias": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.gate.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.shared_experts.down_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.shared_experts.gate_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.mlp.shared_experts.up_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.post_attention_layernorm.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.kv_a_layernorm.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.kv_a_proj_with_mqa.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.kv_b_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.o_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.q_a_layernorm.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.q_a_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.38.self_attn.q_b_proj.weight": "model-00039-of-00048.safetensors",
+ "model.layers.39.input_layernorm.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.0.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.0.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.0.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.1.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.1.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.1.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.10.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.10.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.10.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.11.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.11.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.11.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.12.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.12.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.12.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.13.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.13.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.13.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.14.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.14.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.14.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.15.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.15.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.15.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.16.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.16.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.16.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.17.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.17.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.17.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.18.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.18.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.18.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.19.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.19.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.19.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.2.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.2.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.2.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.20.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.20.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.20.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.21.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.21.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.21.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.22.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.22.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.22.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.23.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.23.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.23.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.24.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.24.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.24.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.25.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.25.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.25.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.26.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.26.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.26.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.27.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.27.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.27.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.28.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.28.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.28.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.29.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.29.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.29.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.3.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.3.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.3.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.30.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.30.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.30.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.31.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.31.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.31.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.32.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.32.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.32.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.33.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.33.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.33.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.34.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.34.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.34.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.35.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.35.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.35.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.36.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.36.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.36.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.37.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.37.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.37.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.38.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.38.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.38.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.39.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.39.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.39.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.4.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.4.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.4.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.40.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.40.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.40.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.41.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.41.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.41.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.42.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.42.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.42.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.43.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.43.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.43.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.44.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.44.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.44.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.45.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.45.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.45.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.46.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.46.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.46.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.47.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.47.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.47.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.48.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.48.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.48.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.49.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.49.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.49.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.5.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.5.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.5.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.50.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.50.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.50.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.51.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.51.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.51.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.52.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.52.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.52.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.53.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.53.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.53.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.54.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.54.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.54.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.55.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.55.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.55.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.56.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.56.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.56.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.57.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.57.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.57.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.58.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.58.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.58.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.59.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.59.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.59.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.6.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.6.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.6.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.60.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.60.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.60.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.61.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.61.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.61.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.62.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.62.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.62.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.63.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.63.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.63.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.7.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.7.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.7.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.8.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.8.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.8.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.9.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.9.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.experts.9.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.gate.e_score_correction_bias": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.gate.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.shared_experts.down_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.shared_experts.gate_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.mlp.shared_experts.up_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.post_attention_layernorm.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.kv_a_layernorm.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.kv_a_proj_with_mqa.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.kv_b_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.o_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.q_a_layernorm.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.q_a_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.39.self_attn.q_b_proj.weight": "model-00040-of-00048.safetensors",
+ "model.layers.40.input_layernorm.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.0.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.0.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.0.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.1.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.1.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.1.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.10.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.10.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.10.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.11.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.11.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.11.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.12.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.12.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.12.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.13.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.13.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.13.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.14.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.14.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.14.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.15.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.15.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.15.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.16.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.16.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.16.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.17.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.17.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.17.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.18.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.18.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.18.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.19.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.19.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.19.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.2.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.2.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.2.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.20.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.20.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.20.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.21.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.21.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.21.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.22.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.22.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.22.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.23.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.23.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.23.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.24.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.24.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.24.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.25.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.25.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.25.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.26.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.26.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.26.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.27.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.27.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.27.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.28.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.28.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.28.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.29.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.29.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.29.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.3.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.3.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.3.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.30.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.30.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.30.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.31.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.31.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.31.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.32.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.32.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.32.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.33.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.33.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.33.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.34.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.34.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.34.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.35.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.35.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.35.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.36.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.36.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.36.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.37.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.37.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.37.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.38.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.38.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.38.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.39.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.39.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.39.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.4.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.4.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.4.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.40.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.40.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.40.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.41.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.41.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.41.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.42.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.42.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.42.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.43.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.43.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.43.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.44.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.44.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.44.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.45.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.45.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.45.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.46.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.46.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.46.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.47.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.47.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.47.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.48.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.48.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.48.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.49.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.49.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.49.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.5.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.5.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.5.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.50.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.50.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.50.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.51.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.51.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.51.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.52.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.52.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.52.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.53.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.53.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.53.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.54.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.54.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.54.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.55.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.55.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.55.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.56.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.56.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.56.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.57.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.57.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.57.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.58.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.58.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.58.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.59.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.59.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.59.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.6.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.6.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.6.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.60.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.60.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.60.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.61.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.61.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.61.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.62.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.62.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.62.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.63.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.63.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.63.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.7.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.7.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.7.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.8.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.8.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.8.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.9.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.9.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.experts.9.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.gate.e_score_correction_bias": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.gate.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.shared_experts.down_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.shared_experts.gate_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.mlp.shared_experts.up_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.post_attention_layernorm.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.kv_a_layernorm.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.kv_a_proj_with_mqa.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.kv_b_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.o_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.q_a_layernorm.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.q_a_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.40.self_attn.q_b_proj.weight": "model-00041-of-00048.safetensors",
+ "model.layers.41.input_layernorm.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.0.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.0.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.0.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.1.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.1.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.1.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.10.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.10.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.10.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.11.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.11.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.11.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.12.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.12.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.12.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.13.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.13.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.13.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.14.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.14.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.14.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.15.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.15.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.15.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.16.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.16.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.16.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.17.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.17.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.17.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.18.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.18.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.18.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.19.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.19.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.19.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.2.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.2.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.2.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.20.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.20.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.20.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.21.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.21.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.21.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.22.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.22.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.22.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.23.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.23.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.23.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.24.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.24.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.24.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.25.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.25.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.25.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.26.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.26.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.26.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.27.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.27.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.27.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.28.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.28.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.28.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.29.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.29.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.29.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.3.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.3.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.3.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.30.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.30.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.30.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.31.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.31.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.31.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.32.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.32.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.32.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.33.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.33.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.33.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.34.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.34.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.34.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.35.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.35.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.35.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.36.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.36.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.36.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.37.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.37.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.37.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.38.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.38.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.38.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.39.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.39.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.39.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.4.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.4.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.4.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.40.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.40.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.40.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.41.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.41.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.41.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.42.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.42.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.42.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.43.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.43.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.43.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.44.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.44.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.44.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.45.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.45.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.45.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.46.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.46.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.46.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.47.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.47.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.47.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.48.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.48.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.48.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.49.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.49.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.49.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.5.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.5.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.5.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.50.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.50.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.50.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.51.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.51.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.51.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.52.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.52.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.52.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.53.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.53.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.53.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.54.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.54.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.54.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.55.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.55.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.55.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.56.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.56.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.56.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.57.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.57.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.57.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.58.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.58.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.58.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.59.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.59.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.59.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.6.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.6.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.6.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.60.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.60.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.60.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.61.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.61.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.61.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.62.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.62.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.62.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.63.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.63.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.63.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.7.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.7.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.7.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.8.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.8.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.8.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.9.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.9.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.experts.9.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.gate.e_score_correction_bias": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.gate.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.shared_experts.down_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.shared_experts.gate_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.mlp.shared_experts.up_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.post_attention_layernorm.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.kv_a_layernorm.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.kv_a_proj_with_mqa.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.kv_b_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.o_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.q_a_layernorm.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.q_a_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.41.self_attn.q_b_proj.weight": "model-00042-of-00048.safetensors",
+ "model.layers.42.input_layernorm.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.0.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.0.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.0.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.1.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.1.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.1.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.10.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.10.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.10.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.11.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.11.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.11.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.12.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.12.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.12.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.13.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.13.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.13.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.14.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.14.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.14.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.15.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.15.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.15.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.16.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.16.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.16.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.17.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.17.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.17.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.18.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.18.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.18.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.19.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.19.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.19.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.2.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.2.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.2.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.20.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.20.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.20.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.21.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.21.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.21.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.22.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.22.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.22.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.23.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.23.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.23.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.24.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.24.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.24.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.25.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.25.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.25.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.26.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.26.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.26.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.27.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.27.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.27.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.28.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.28.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.28.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.29.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.29.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.29.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.3.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.3.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.3.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.30.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.30.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.30.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.31.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.31.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.31.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.32.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.32.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.32.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.33.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.33.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.33.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.34.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.34.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.34.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.35.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.35.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.35.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.36.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.36.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.36.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.37.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.37.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.37.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.38.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.38.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.38.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.39.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.39.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.39.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.4.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.4.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.4.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.40.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.40.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.40.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.41.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.41.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.41.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.42.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.42.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.42.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.43.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.43.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.43.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.44.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.44.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.44.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.45.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.45.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.45.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.46.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.46.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.46.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.47.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.47.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.47.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.48.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.48.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.48.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.49.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.49.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.49.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.5.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.5.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.5.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.50.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.50.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.50.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.51.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.51.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.51.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.52.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.52.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.52.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.53.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.53.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.53.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.54.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.54.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.54.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.55.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.55.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.55.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.56.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.56.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.56.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.57.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.57.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.57.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.58.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.58.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.58.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.59.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.59.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.59.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.6.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.6.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.6.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.60.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.60.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.60.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.61.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.61.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.61.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.62.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.62.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.62.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.63.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.63.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.63.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.7.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.7.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.7.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.8.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.8.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.8.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.9.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.9.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.experts.9.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.gate.e_score_correction_bias": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.gate.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.shared_experts.down_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.shared_experts.gate_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.mlp.shared_experts.up_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.post_attention_layernorm.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.kv_a_layernorm.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.kv_a_proj_with_mqa.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.kv_b_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.o_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.q_a_layernorm.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.q_a_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.42.self_attn.q_b_proj.weight": "model-00043-of-00048.safetensors",
+ "model.layers.43.input_layernorm.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.0.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.0.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.0.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.1.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.1.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.1.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.10.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.10.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.10.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.11.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.11.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.11.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.12.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.12.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.12.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.13.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.13.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.13.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.14.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.14.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.14.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.15.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.15.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.15.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.16.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.16.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.16.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.17.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.17.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.17.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.18.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.18.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.18.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.19.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.19.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.19.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.2.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.2.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.2.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.20.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.20.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.20.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.21.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.21.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.21.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.22.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.22.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.22.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.23.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.23.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.23.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.24.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.24.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.24.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.25.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.25.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.25.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.26.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.26.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.26.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.27.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.27.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.27.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.28.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.28.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.28.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.29.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.29.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.29.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.3.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.3.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.3.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.30.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.30.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.30.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.31.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.31.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.31.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.32.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.32.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.32.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.33.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.33.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.33.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.34.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.34.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.34.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.35.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.35.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.35.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.36.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.36.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.36.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.37.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.37.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.37.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.38.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.38.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.38.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.39.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.39.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.39.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.4.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.4.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.4.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.40.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.40.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.40.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.41.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.41.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.41.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.42.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.42.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.42.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.43.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.43.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.43.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.44.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.44.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.44.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.45.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.45.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.45.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.46.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.46.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.46.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.47.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.47.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.47.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.48.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.48.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.48.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.49.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.49.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.49.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.5.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.5.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.5.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.50.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.50.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.50.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.51.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.51.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.51.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.52.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.52.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.52.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.53.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.53.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.53.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.54.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.54.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.54.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.55.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.55.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.55.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.56.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.56.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.56.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.57.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.57.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.57.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.58.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.58.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.58.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.59.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.59.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.59.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.6.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.6.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.6.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.60.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.60.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.60.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.61.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.61.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.61.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.62.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.62.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.62.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.63.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.63.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.63.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.7.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.7.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.7.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.8.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.8.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.8.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.9.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.9.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.experts.9.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.gate.e_score_correction_bias": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.gate.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.shared_experts.down_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.shared_experts.gate_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.mlp.shared_experts.up_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.post_attention_layernorm.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.kv_a_layernorm.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.kv_a_proj_with_mqa.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.kv_b_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.o_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.q_a_layernorm.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.q_a_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.43.self_attn.q_b_proj.weight": "model-00044-of-00048.safetensors",
+ "model.layers.44.input_layernorm.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.0.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.0.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.0.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.1.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.1.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.1.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.10.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.10.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.10.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.11.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.11.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.11.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.12.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.12.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.12.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.13.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.13.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.13.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.14.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.14.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.14.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.15.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.15.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.15.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.16.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.16.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.16.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.17.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.17.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.17.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.18.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.18.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.18.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.19.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.19.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.19.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.2.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.2.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.2.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.20.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.20.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.20.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.21.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.21.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.21.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.22.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.22.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.22.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.23.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.23.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.23.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.24.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.24.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.24.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.25.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.25.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.25.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.26.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.26.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.26.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.27.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.27.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.27.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.28.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.28.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.28.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.29.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.29.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.29.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.3.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.3.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.3.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.30.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.30.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.30.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.31.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.31.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.31.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.32.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.32.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.32.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.33.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.33.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.33.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.34.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.34.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.34.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.35.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.35.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.35.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.36.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.36.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.36.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.37.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.37.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.37.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.38.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.38.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.38.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.39.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.39.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.39.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.4.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.4.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.4.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.40.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.40.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.40.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.41.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.41.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.41.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.42.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.42.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.42.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.43.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.43.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.43.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.44.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.44.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.44.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.45.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.45.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.45.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.46.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.46.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.46.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.47.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.47.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.47.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.48.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.48.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.48.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.49.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.49.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.49.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.5.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.5.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.5.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.50.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.50.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.50.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.51.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.51.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.51.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.52.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.52.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.52.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.53.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.53.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.53.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.54.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.54.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.54.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.55.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.55.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.55.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.56.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.56.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.56.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.57.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.57.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.57.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.58.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.58.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.58.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.59.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.59.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.59.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.6.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.6.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.6.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.60.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.60.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.60.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.61.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.61.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.61.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.62.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.62.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.62.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.63.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.63.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.63.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.7.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.7.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.7.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.8.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.8.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.8.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.9.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.9.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.experts.9.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.gate.e_score_correction_bias": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.gate.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.shared_experts.down_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.shared_experts.gate_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.mlp.shared_experts.up_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.post_attention_layernorm.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.kv_a_layernorm.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.kv_a_proj_with_mqa.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.kv_b_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.o_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.q_a_layernorm.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.q_a_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.44.self_attn.q_b_proj.weight": "model-00045-of-00048.safetensors",
+ "model.layers.45.input_layernorm.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.0.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.0.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.0.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.1.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.1.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.1.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.10.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.10.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.10.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.11.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.11.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.11.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.12.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.12.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.12.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.13.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.13.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.13.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.14.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.14.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.14.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.15.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.15.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.15.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.16.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.16.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.16.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.17.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.17.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.17.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.18.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.18.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.18.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.19.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.19.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.19.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.2.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.2.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.2.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.20.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.20.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.20.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.21.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.21.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.21.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.22.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.22.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.22.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.23.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.23.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.23.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.24.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.24.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.24.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.25.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.25.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.25.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.26.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.26.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.26.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.27.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.27.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.27.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.28.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.28.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.28.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.29.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.29.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.29.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.3.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.3.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.3.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.30.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.30.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.30.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.31.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.31.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.31.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.32.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.32.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.32.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.33.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.33.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.33.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.34.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.34.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.34.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.35.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.35.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.35.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.36.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.36.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.36.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.37.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.37.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.37.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.38.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.38.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.38.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.39.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.39.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.39.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.4.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.4.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.4.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.40.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.40.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.40.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.41.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.41.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.41.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.42.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.42.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.42.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.43.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.43.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.43.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.44.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.44.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.44.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.45.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.45.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.45.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.46.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.46.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.46.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.47.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.47.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.47.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.48.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.48.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.48.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.49.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.49.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.49.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.5.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.5.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.5.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.50.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.50.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.50.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.51.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.51.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.51.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.52.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.52.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.52.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.53.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.53.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.53.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.54.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.54.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.54.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.55.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.55.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.55.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.56.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.56.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.56.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.57.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.57.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.57.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.58.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.58.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.58.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.59.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.59.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.59.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.6.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.6.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.6.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.60.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.60.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.60.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.61.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.61.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.61.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.62.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.62.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.62.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.63.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.63.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.63.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.7.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.7.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.7.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.8.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.8.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.8.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.9.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.9.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.experts.9.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.gate.e_score_correction_bias": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.gate.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.shared_experts.down_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.shared_experts.gate_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.mlp.shared_experts.up_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.post_attention_layernorm.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.kv_a_layernorm.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.kv_a_proj_with_mqa.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.kv_b_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.o_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.q_a_layernorm.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.q_a_proj.weight": "model-00046-of-00048.safetensors",
+ "model.layers.45.self_attn.q_b_proj.weight": "model-00046-of-00048.safetensors",
+ "lm_head.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.input_layernorm.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.0.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.0.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.0.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.1.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.1.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.1.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.10.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.10.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.10.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.11.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.11.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.11.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.12.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.12.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.12.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.13.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.13.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.13.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.14.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.14.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.14.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.15.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.15.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.15.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.16.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.16.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.16.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.17.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.17.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.17.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.18.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.18.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.18.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.19.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.19.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.19.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.2.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.2.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.2.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.20.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.20.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.20.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.21.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.21.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.21.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.22.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.22.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.22.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.23.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.23.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.23.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.24.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.24.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.24.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.25.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.25.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.25.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.26.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.26.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.26.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.27.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.27.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.27.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.28.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.28.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.28.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.29.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.29.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.29.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.3.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.3.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.3.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.30.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.30.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.30.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.31.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.31.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.31.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.32.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.32.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.32.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.33.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.33.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.33.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.34.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.34.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.34.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.35.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.35.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.35.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.36.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.36.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.36.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.37.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.37.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.37.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.38.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.38.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.38.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.39.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.39.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.39.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.4.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.4.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.4.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.40.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.40.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.40.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.41.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.41.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.41.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.42.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.42.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.42.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.43.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.43.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.43.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.44.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.44.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.44.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.45.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.45.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.45.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.46.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.46.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.46.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.47.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.47.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.47.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.48.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.48.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.48.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.49.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.49.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.49.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.5.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.5.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.5.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.50.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.50.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.50.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.51.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.51.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.51.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.52.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.52.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.52.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.53.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.53.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.53.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.54.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.54.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.54.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.55.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.55.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.55.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.56.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.56.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.56.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.57.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.57.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.57.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.58.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.58.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.58.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.59.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.59.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.59.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.6.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.6.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.6.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.60.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.60.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.60.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.61.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.61.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.61.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.62.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.62.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.62.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.63.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.63.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.63.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.7.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.7.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.7.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.8.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.8.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.8.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.9.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.9.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.experts.9.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.gate.e_score_correction_bias": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.gate.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.shared_experts.down_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.shared_experts.gate_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.mlp.shared_experts.up_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.post_attention_layernorm.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.kv_a_layernorm.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.kv_a_proj_with_mqa.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.kv_b_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.o_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.q_a_layernorm.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.q_a_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.46.self_attn.q_b_proj.weight": "model-00047-of-00048.safetensors",
+ "model.layers.47.shared_head.head.weight": "model-00047-of-00048.safetensors",
+ "model.norm.weight": "model-00047-of-00048.safetensors",
+ "model.layers.47.eh_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.enorm.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.hnorm.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.input_layernorm.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.0.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.0.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.0.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.1.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.1.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.1.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.10.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.10.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.10.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.11.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.11.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.11.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.12.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.12.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.12.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.13.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.13.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.13.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.14.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.14.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.14.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.15.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.15.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.15.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.16.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.16.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.16.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.17.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.17.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.17.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.18.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.18.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.18.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.19.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.19.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.19.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.2.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.2.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.2.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.20.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.20.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.20.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.21.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.21.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.21.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.22.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.22.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.22.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.23.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.23.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.23.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.24.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.24.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.24.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.25.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.25.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.25.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.26.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.26.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.26.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.27.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.27.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.27.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.28.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.28.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.28.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.29.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.29.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.29.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.3.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.3.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.3.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.30.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.30.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.30.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.31.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.31.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.31.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.32.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.32.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.32.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.33.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.33.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.33.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.34.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.34.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.34.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.35.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.35.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.35.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.36.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.36.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.36.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.37.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.37.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.37.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.38.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.38.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.38.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.39.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.39.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.39.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.4.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.4.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.4.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.40.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.40.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.40.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.41.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.41.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.41.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.42.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.42.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.42.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.43.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.43.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.43.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.44.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.44.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.44.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.45.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.45.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.45.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.46.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.46.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.46.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.47.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.47.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.47.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.48.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.48.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.48.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.49.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.49.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.49.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.5.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.5.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.5.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.50.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.50.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.50.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.51.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.51.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.51.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.52.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.52.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.52.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.53.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.53.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.53.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.54.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.54.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.54.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.55.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.55.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.55.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.56.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.56.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.56.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.57.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.57.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.57.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.58.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.58.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.58.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.59.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.59.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.59.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.6.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.6.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.6.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.60.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.60.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.60.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.61.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.61.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.61.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.62.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.62.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.62.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.63.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.63.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.63.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.7.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.7.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.7.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.8.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.8.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.8.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.9.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.9.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.experts.9.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.gate.e_score_correction_bias": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.gate.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.shared_experts.down_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.shared_experts.gate_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.mlp.shared_experts.up_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.post_attention_layernorm.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.kv_a_layernorm.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.kv_a_proj_with_mqa.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.kv_b_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.o_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.q_a_layernorm.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.q_a_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.self_attn.q_b_proj.weight": "model-00048-of-00048.safetensors",
+ "model.layers.47.shared_head.norm.weight": "model-00048-of-00048.safetensors"
+ }
+}
\ No newline at end of file
diff --git a/special_tokens_map.json b/special_tokens_map.json
new file mode 100644
index 0000000000000000000000000000000000000000..e55d4ebc804e6777c73f1ced0c44c4b2964cea0d
--- /dev/null
+++ b/special_tokens_map.json
@@ -0,0 +1,36 @@
+{
+ "additional_special_tokens": [
+ "<|endoftext|>",
+ "[MASK]",
+ "[gMASK]",
+ "[sMASK]",
+ "<sop>",
+ "<eop>",
+ "<|system|>",
+ "<|user|>",
+ "<|assistant|>",
+ "<|observation|>",
+ "<|begin_of_image|>",
+ "<|end_of_image|>",
+ "<|begin_of_video|>",
+ "<|end_of_video|>",
+ "<|begin_of_audio|>",
+ "<|end_of_audio|>",
+ "<|begin_of_transcription|>",
+ "<|end_of_transcription|>"
+ ],
+ "eos_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+}
diff --git a/tokenizer.json b/tokenizer.json
new file mode 100644
index 0000000000000000000000000000000000000000..aba40197a4cdb5607f4ab7a05fb0a4ee8054fd6d
--- /dev/null
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:19e773648cb4e65de8660ea6365e10acca112d42a854923df93db4a6f333a82d
+size 20217442
diff --git a/tokenizer_config.json b/tokenizer_config.json
new file mode 100644
index 0000000000000000000000000000000000000000..f9e45b9c5a1d2e11b7657c81be99414c6002d732
--- /dev/null
+++ b/tokenizer_config.json
@@ -0,0 +1,324 @@
+{
+ "added_tokens_decoder": {
+ "154820": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154821": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154822": {
+ "content": "[gMASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154823": {
+ "content": "[sMASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154824": {
+ "content": "<sop>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154825": {
+ "content": "<eop>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154826": {
+ "content": "<|system|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154827": {
+ "content": "<|user|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154828": {
+ "content": "<|assistant|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154829": {
+ "content": "<|observation|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154830": {
+ "content": "<|begin_of_image|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154831": {
+ "content": "<|end_of_image|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154832": {
+ "content": "<|begin_of_video|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154833": {
+ "content": "<|end_of_video|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154834": {
+ "content": "<|begin_of_audio|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154835": {
+ "content": "<|end_of_audio|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154836": {
+ "content": "<|begin_of_transcription|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154837": {
+ "content": "<|end_of_transcription|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "154838": {
+ "content": "<|code_prefix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154839": {
+ "content": "<|code_middle|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154840": {
+ "content": "<|code_suffix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154841": {
+ "content": "<think>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154842": {
+ "content": "</think>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154843": {
+ "content": "<tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154844": {
+ "content": "</tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154845": {
+ "content": "<tool_response>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154846": {
+ "content": "</tool_response>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154847": {
+ "content": "<arg_key>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154848": {
+ "content": "</arg_key>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154849": {
+ "content": "<arg_value>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154850": {
+ "content": "</arg_value>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154851": {
+ "content": "/nothink",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154852": {
+ "content": "<|begin_of_box|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154853": {
+ "content": "<|end_of_box|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154854": {
+ "content": "<|image|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "154855": {
+ "content": "<|video|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "additional_special_tokens": [
+ "<|endoftext|>",
+ "[MASK]",
+ "[gMASK]",
+ "[sMASK]",
+ "<sop>",
+ "<eop>",
+ "<|system|>",
+ "<|user|>",
+ "<|assistant|>",
+ "<|observation|>",
+ "<|begin_of_image|>",
+ "<|end_of_image|>",
+ "<|begin_of_video|>",
+ "<|end_of_video|>",
+ "<|begin_of_audio|>",
+ "<|end_of_audio|>",
+ "<|begin_of_transcription|>",
+ "<|end_of_transcription|>"
+ ],
+ "bos_token": null,
+ "clean_up_tokenization_spaces": false,
+ "do_lower_case": false,
+ "eos_token": "<|endoftext|>",
+ "extra_special_tokens": {},
+ "model_max_length": 128000,
+ "pad_token": "[MASK]",
+ "padding_side": "left",
+ "remove_space": false,
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": null,
+ "chat_template": "[gMASK]<sop>\n{%- if tools -%}\n<|system|>\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>\n{% for tool in tools %}\n{{ tool | tojson(ensure_ascii=False) }}\n{% endfor %}\n</tools>\n\nFor each function call, output the function name and arguments within the following XML format:\n<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call>{%- endif -%}\n{%- macro visible_text(content) -%}\n {%- if content is string -%}\n {{- content }}\n {%- elif content is iterable and content is not mapping -%}\n {%- for item in content -%}\n {%- if item is mapping and item.type == 'text' -%}\n {{- item.text }}\n {%- elif item is string -%}\n {{- item }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{- content }}\n {%- endif -%}\n{%- endmacro -%}\n{%- set ns = namespace(last_user_index=-1) %}\n{%- for m in messages %}\n {%- if m.role == 'user' %}\n {% set ns.last_user_index = loop.index0 -%}\n {%- endif %}\n{%- endfor %}\n{% for m in messages %}\n{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}\n{%- elif m.role == 'assistant' -%}\n<|assistant|>\n{%- set reasoning_content = '' %}\n{%- set content = visible_text(m.content) %}\n{%- if m.reasoning_content is string %}\n {%- set reasoning_content = m.reasoning_content %}\n{%- else %}\n {%- if '</think>' in content %}\n {%- set reasoning_content = content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n {%- endif %}\n{%- endif %}\n{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content -%}\n{{ '<think>' + reasoning_content.strip() + '</think>'}}\n{%- else -%}\n{{ '</think>' }}\n{%- endif -%}\n{%- if content.strip() -%}\n{{ content.strip() }}\n{%- endif -%}\n{% if m.tool_calls %}\n{% for tc in m.tool_calls %}\n{%- if tc.function %}\n {%- set tc = tc.function %}\n{%- endif %}\n{{- '<tool_call>' + tc.name -}}\n{% set _args = tc.arguments %}{% for k, v in _args.items() %}<arg_key>{{ k }}</arg_key><arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>{% endfor %}</tool_call>{% endfor %}\n{% endif %}\n{%- elif m.role == 'tool' -%}\n{%- if m.content is string -%}\n{%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|observation|>' }}\n{%- endif %}\n{{- '<tool_response>' }}\n{{- m.content }}\n{{- '</tool_response>' }}\n{%- else -%}\n<|observation|>{% for tr in m.content %}\n<tool_response>{{ tr.output if tr.output is defined else tr }}</tool_response>{% endfor -%}\n{% endif -%}\n{%- elif m.role == 'system' -%}\n<|system|>{{ visible_text(m.content) }}\n{%- endif -%}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n <|assistant|>{{- '</think>' if (enable_thinking is defined and not enable_thinking) else '<think>' -}}\n{%- endif -%}"
+}
\ No newline at end of file