diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000000000000000000000000000000000000..aa7aacd0134a92c3c1943fdecc75cd8b7420cce6
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,37 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+model.safetensors.index.json filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
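The `.gitattributes` hunk above routes matching files through Git LFS. As a rough illustration of which repository files those patterns catch, here is a minimal Python sketch. It approximates gitattributes matching with `fnmatch`, which is an assumption rather than Git's real matcher: in particular, Git treats `/**/` as "zero or more directories", while this sketch requires at least one. The helper name `is_lfs_tracked` and the subset of patterns are illustrative only.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns added in the .gitattributes hunk.
LFS_PATTERNS = [
    "*.safetensors",
    "*.bin",
    "*.tar.*",
    "*tfevents*",
    "saved_model/**/*",
    "model.safetensors.index.json",
    "tokenizer.json",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: would `path` be handled by the LFS filter?

    Patterns without a slash are compared against the file name only,
    patterns with a slash against the whole repository-relative path.
    """
    name = PurePosixPath(path).name
    for pattern in patterns:
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False


if __name__ == "__main__":
    for candidate in ["tokenizer.json", "config.json", "README.md"]:
        print(candidate, is_lfs_tracked(candidate))
```

Note that `tokenizer.json` and `model.safetensors.index.json` are listed by exact name, so they go through LFS even though other `.json` files such as `config.json` do not.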
diff --git a/README.md b/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d56ca11060668ec866d9654b6177c85d1ec28946
--- /dev/null
+++ b/README.md
@@ -0,0 +1,115 @@
+---
+language:
+- en
+- zh
+library_name: transformers
+license: mit
+pipeline_tag: text-generation
+base_model:
+- zai-org/GLM-5.1
+tags:
+- unsloth
+- glm_moe_dsa
+---
+
+
+You can follow instructions in our [guide here](https://unsloth.ai/docs/models/glm-5.1).
+
+---
+
+
+# GLM-5.1
+
+
+

+
+
+ 👋 Join our WeChat or Discord community.
+
+ 📖 Check out the GLM-5.1 blog and GLM-5 Technical report.
+
+ 📍 Use GLM-5.1 API services on Z.ai API Platform.
+
+ 🔜 GLM-5.1 will be available on chat.z.ai in the coming days.
+
+
+
+ [Paper]
+ [GitHub]
+
+
+## Introduction
+
+GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).
+
+
+
+But the most meaningful leap goes beyond first-pass performance. Previous models—including GLM-5—tend to exhaust their repertoire early: they apply familiar techniques for quick initial gains, then plateau. Giving them more time doesn't help.
+
+GLM-5.1, by contrast, is built to stay effective on agentic tasks over much longer horizons. We've found that the model handles ambiguous problems with better judgment and stays productive over longer sessions. It breaks complex problems down, runs experiments, reads results, and identifies blockers with real precision. By revisiting its reasoning and revising its strategy through repeated iteration, GLM-5.1 sustains optimization over hundreds of rounds and thousands of tool calls. The longer it runs, the better the result.
+
+## Benchmark
+
+| | GLM-5.1 | GLM-5 | Qwen3.6-Plus | Minimax M2.7 | DeepSeek-V3.2 | Kimi K2.5 | Claude Opus 4.6 | Gemini 3.1 Pro | GPT-5.4 |
+| ------------------------------------------ | ------------------ | ------------------- | ------------ | -------------------- | -------------------- | ---------- | --------------- | -------------- | ---------------- |
+| HLE | 31.0 | 30.5 | 28.8 | 28.0 | 25.1 | 31.5 | 36.7 | **45.0** | 39.8 |
+| HLE (w/ Tools) | 52.3 | 50.4 | 50.6 | - | 40.8 | 51.8 | **53.1*** | 51.4* | 52.1* |
+| AIME 2026 | 95.3 | 95.4 | 95.1 | 89.8 | 95.1 | 94.5 | 95.6 | 98.2 | **98.7** |
+| HMMT Nov. 2025 | 94.0 | **96.9** | 94.6 | 81.0 | 90.2 | 91.1 | 96.3 | 94.8 | 95.8 |
+| HMMT Feb. 2026 | 82.6 | 82.8 | 87.8 | 72.7 | 79.9 | 81.3 | 84.3 | 87.3 | **91.8** |
+| IMOAnswerBench | 83.8 | 82.5 | 83.8 | 66.3 | 78.3 | 81.8 | 75.3 | 81.0 | **91.4** |
+| GPQA-Diamond | 86.2 | 86.0 | 90.4 | 87.0 | 82.4 | 87.6 | 91.3 | **94.3** | 92.0 |
+| SWE-Bench Pro | **58.4** | 55.1 | 56.6 | 56.2 | - | 53.8 | 57.3 | 54.2 | 57.7 |
+| NL2Repo | 42.7 | 35.9 | 37.9 | 39.8 | - | 32.0 | **49.8** | 33.4 | 41.3 |
+| Terminal-Bench 2.0 (Terminus-2) | 63.5 | 56.2 | 61.6 | - | 39.3 | 50.8 | 65.4 | **68.5** | - |
+| Terminal-Bench 2.0 (Best self-reported) | 66.5 (Claude Code) | 56.2 (Claude Code) | - | 57.0 (Claude Code) | 46.4 (Claude Code) | - | - | - | **75.1** (Codex) |
+| CyberGym | **68.7** | 48.3 | - | - | 17.3 | 41.3 | 66.6 | - | - |
+| BrowseComp | **68.0** | 62.0 | - | - | 51.4 | 60.6 | - | - | - |
+| BrowseComp (w/ Context Manage) | 79.3 | 75.9 | - | - | 67.6 | 74.9 | 84.0 | **85.9** | 82.7 |
+| τ³-Bench | 70.6 | 69.2 | 70.7 | 67.6 | 69.2 | 66.0 | 72.4 | 67.1 | **72.9** |
+| MCP-Atlas (Public Set) | 71.8 | 69.2 | **74.1** | 48.8 | 62.2 | 63.8 | 73.8 | 69.2 | 67.2 |
+| Tool-Decathlon | 40.7 | 38.0 | 39.8 | 46.3 | 35.2 | 27.8 | 47.2 | 48.8 | **54.6** |
+| Vending Bench 2 | $5,634.00 | $4,432.12 | $5,114.87 | - | $1,034.00 | $1,198.46 | **$8,017.59** | $911.21 | $6,144.18 |
+
+## Serve GLM-5.1 Locally
+
+The following open-source frameworks support local deployment of GLM-5.1:
+
+- [SGLang](https://github.com/sgl-project/sglang) (v0.5.10+) — see [cookbook](https://cookbook.sglang.io/autoregressive/GLM/GLM-5.1)
+- [vLLM](https://github.com/vllm-project/vllm) (v0.19.0+) — see [recipes](https://github.com/vllm-project/recipes/blob/main/GLM/GLM5.md)
+- [xLLM](https://github.com/jd-opensource/xllm) (v0.8.0+) — see [example](https://github.com/zai-org/GLM-5/blob/main/example/ascend.md)
+- [Transformers](https://github.com/huggingface/transformers) (v0.5.3+) — see [transformers docs](https://github.com/huggingface/transformers/blob/main/docs/source/en/model_doc/glm_moe_dsa.md)
+- [KTransformers](https://github.com/kvcache-ai/ktransformers) (v0.5.3+) — see [tutorial](https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/GLM-5.1-Tutorial.md)
+
+## Citation
+
+If you find GLM-5.1 or GLM-5 useful in your research, please cite our technical report:
+
+```bibtex
+@misc{glm5team2026glm5vibecodingagentic,
+ title={GLM-5: from Vibe Coding to Agentic Engineering},
+ author={GLM-5-Team and Aohan Zeng and Xin Lv and Zhenyu Hou and Zhengxiao Du and Qinkai Zheng and Bin Chen and Da Yin and Chendi Ge and Chenghua Huang and Chengxing Xie and Chenzheng Zhu and Congfeng Yin and Cunxiang Wang and Gengzheng Pan and Hao Zeng and Haoke Zhang and Haoran Wang and Huilong Chen and Jiajie Zhang and Jian Jiao and Jiaqi Guo and Jingsen Wang and Jingzhao Du and Jinzhu Wu and Kedong Wang and Lei Li and Lin Fan and Lucen Zhong and Mingdao Liu and Mingming Zhao and Pengfan Du and Qian Dong and Rui Lu and Shuang-Li and Shulin Cao and Song Liu and Ting Jiang and Xiaodong Chen and Xiaohan Zhang and Xuancheng Huang and Xuezhen Dong and Yabo Xu and Yao Wei and Yifan An and Yilin Niu and Yitong Zhu and Yuanhao Wen and Yukuo Cen and Yushi Bai and Zhongpei Qiao and Zihan Wang and Zikang Wang and Zilin Zhu and Ziqiang Liu and Zixuan Li and Bojie Wang and Bosi Wen and Can Huang and Changpeng Cai and Chao Yu and Chen Li and Chengwei Hu and Chenhui Zhang and Dan Zhang and Daoyan Lin and Dayong Yang and Di Wang and Ding Ai and Erle Zhu and Fangzhou Yi and Feiyu Chen and Guohong Wen and Hailong Sun and Haisha Zhao and Haiyi Hu and Hanchen Zhang and Hanrui Liu and Hanyu Zhang and Hao Peng and Hao Tai and Haobo Zhang and He Liu and Hongwei Wang and Hongxi Yan and Hongyu Ge and Huan Liu and Huanpeng Chu and Jia'ni Zhao and Jiachen Wang and Jiajing Zhao and Jiamin Ren and Jiapeng Wang and Jiaxin Zhang and Jiayi Gui and Jiayue Zhao and Jijie Li and Jing An and Jing Li and Jingwei Yuan and Jinhua Du and Jinxin Liu and Junkai Zhi and Junwen Duan and Kaiyue Zhou and Kangjian Wei and Ke Wang and Keyun Luo and Laiqiang Zhang and Leigang Sha and Liang Xu and Lindong Wu and Lintao Ding and Lu Chen and Minghao Li and Nianyi Lin and Pan Ta and Qiang Zou and Rongjun Song and Ruiqi Yang and Shangqing Tu and Shangtong Yang and Shaoxiang Wu and Shengyan Zhang and Shijie Li and Shuang Li and Shuyi Fan and Wei Qin and Wei Tian and Weining Zhang and Wenbo Yu and Wenjie Liang and Xiang Kuang and Xiangmeng Cheng and Xiangyang Li and Xiaoquan Yan and Xiaowei Hu and Xiaoying Ling and Xing Fan and Xingye Xia and Xinyuan Zhang and Xinze Zhang and Xirui Pan and Xu Zou and Xunkai Zhang and Yadi Liu and Yandong Wu and Yanfu Li and Yidong Wang and Yifan Zhu and Yijun Tan and Yilin Zhou and Yiming Pan and Ying Zhang and Yinpei Su and Yipeng Geng and Yong Yan and Yonglin Tan and Yuean Bi and Yuhan Shen and Yuhao Yang and Yujiang Li and Yunan Liu and Yunqing Wang and Yuntao Li and Yurong Wu and Yutao Zhang and Yuxi Duan and Yuxuan Zhang and Zezhen Liu and Zhengtao Jiang and Zhenhe Yan and Zheyu Zhang and Zhixiang Wei and Zhuo Chen and Zhuoer Feng and Zijun Yao and Ziwei Chai and Ziyuan Wang and Zuzhou Zhang and Bin Xu and Minlie Huang and Hongning Wang and Juanzi Li and Yuxiao Dong and Jie Tang},
+ year={2026},
+ eprint={2602.15763},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2602.15763},
+}
+```
\ No newline at end of file
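The README's Introduction says GLM-5.1 leads GLM-5 "by a wide margin" on NL2Repo and Terminal-Bench 2.0. The short Python sketch below recomputes those margins from the Benchmark table so the claim can be read as concrete point differences. The scores are copied from the table rows; the `margins` helper is a hypothetical name used only for illustration.

```python
# (GLM-5.1, GLM-5) scores copied from the README's Benchmark table.
SCORES = {
    "SWE-Bench Pro": (58.4, 55.1),
    "NL2Repo": (42.7, 35.9),
    "Terminal-Bench 2.0 (Terminus-2)": (63.5, 56.2),
}


def margins(scores):
    """Return the GLM-5.1 minus GLM-5 difference per benchmark, in points."""
    return {name: round(new - old, 1) for name, (new, old) in scores.items()}


if __name__ == "__main__":
    for name, delta in margins(SCORES).items():
        print(f"{name}: +{delta} points over GLM-5")
```

On these three rows the gap is largest on Terminal-Bench 2.0 and NL2Repo, and smaller on SWE-Bench Pro, which is consistent with how the Introduction words the comparison.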
diff --git a/chat_template.jinja b/chat_template.jinja
new file mode 100644
index 0000000000000000000000000000000000000000..0093efaa15b9ee3b0d8799ec64933fe0897b6687
--- /dev/null
+++ b/chat_template.jinja
@@ -0,0 +1,117 @@
+[gMASK]<sop>
+{%- if tools -%}
+{%- macro tool_to_json(tool) -%}
+ {%- set ns_tool = namespace(first=true) -%}
+ {{ '{' -}}
+ {%- for k, v in tool.items() -%}
+ {%- if k != 'defer_loading' and k != 'strict' -%}
+ {%- if not ns_tool.first -%}{{- ', ' -}}{%- endif -%}
+ {%- set ns_tool.first = false -%}
+ "{{ k }}": {{ v | tojson(ensure_ascii=False) }}
+ {%- endif -%}
+ {%- endfor -%}
+ {{- '}' -}}
+{%- endmacro -%}
+<|system|>
+# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{% for tool in tools %}
+{%- if 'function' in tool -%}
+ {%- set tool = tool['function'] -%}
+{%- endif -%}
+{% if tool.defer_loading is not defined or not tool.defer_loading %}
+{{ tool_to_json(tool) }}
+{% endif %}
+{% endfor %}
+</tools>
+
+For each function call, output the function name and arguments within the following XML format:
+<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call>{%- endif -%}
+{%- macro visible_text(content) -%}
+ {%- if content is string -%}
+ {{- content }}
+ {%- elif content is iterable and content is not mapping -%}
+ {%- for item in content -%}
+ {%- if item is mapping and item.type == 'text' -%}
+ {{- item.text }}
+ {%- elif item is string -%}
+ {{- item }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{- content }}
+ {%- endif -%}
+{%- endmacro -%}
+{%- set ns = namespace(last_user_index=-1, thinking_indices='') -%}
+{%- for m in messages %}
+ {%- if m.role == 'user' %}
+ {%- set ns.last_user_index = loop.index0 -%}
+ {%- elif m.role == 'assistant' %}
+ {%- if m.reasoning_content is string %}
+ {%- set ns.thinking_indices = ns.thinking_indices ~ ',' ~ ns.last_user_index ~ ',' -%}
+ {%- endif %}
+ {%- endif %}
+{%- endfor %}
+{%- set ns.has_thinking = false -%}
+{%- for m in messages -%}
+{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}{% set ns.has_thinking = (',' ~ loop.index0 ~ ',') in ns.thinking_indices -%}
+{%- elif m.role == 'assistant' -%}
+<|assistant|>
+{%- set content = visible_text(m.content) %}
+{%- if m.reasoning_content is string %}
+ {%- set reasoning_content = m.reasoning_content %}
+{%- elif '</think>' in content %}
+ {%- set reasoning_content = content.split('</think>')[0].split('<think>')[-1] %}
+ {%- set content = content.split('</think>')[-1] %}
+{%- elif loop.index0 > ns.last_user_index and not (enable_thinking is defined and not enable_thinking) %}
+ {%- set reasoning_content = '' %}
+{%- elif loop.index0 < ns.last_user_index and ns.has_thinking %}
+ {%- set reasoning_content = '' %}
+{%- endif %}
+{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content is defined -%}
+{{ '<think>' + reasoning_content + '</think>'}}
+{%- else -%}
+{{ '</think>' }}
+{%- endif -%}
+{%- if content.strip() -%}
+{{ content.strip() }}
+{%- endif -%}
+{% if m.tool_calls %}
+{% for tc in m.tool_calls %}
+{%- if tc.function %}
+ {%- set tc = tc.function %}
+{%- endif %}
+{{- '<tool_call>' + tc.name -}}
+{% set _args = tc.arguments %}{% for k, v in _args.items() %}<arg_key>{{ k }}</arg_key><arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>{% endfor %}</tool_call>{% endfor %}
+{% endif %}
+{%- elif m.role == 'tool' -%}
+{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|observation|>' -}}
+{%- endif %}
+{%- if m.content is string -%}
+ {{- '<tool_response>' + m.content + '</tool_response>' -}}
+{%- else -%}
+ {{- '<tool_response><tools>\n' -}}
+ {% for tr in m.content %}
+ {%- for tool in tools -%}
+ {%- if 'function' in tool -%}
+ {%- set tool = tool['function'] -%}
+ {%- endif -%}
+ {%- if tool.name == tr.name -%}
+ {{- tool_to_json(tool) + '\n' -}}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- endfor -%}
+ {{- '</tools></tool_response>' -}}
+{% endif -%}
+{%- elif m.role == 'system' -%}
+<|system|>{{ visible_text(m.content) }}
+{%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+ <|assistant|>{{- '</think>' if (enable_thinking is defined and not enable_thinking) else '<think>' -}}
+{%- endif -%}
\ No newline at end of file
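The chat template above decides, per assistant turn, whether earlier reasoning is re-rendered into the prompt: reasoning is kept when the turn comes after the last user message, or when `clear_thinking` is passed and falsy. The Python sketch below mirrors only that rule, as a simplified reading of the template rather than a replacement for it. It assumes reasoning arrives in a string `reasoning_content` field and ignores the template's other branches (reasoning embedded inline in `content`, and the empty-string placeholders). The function name `kept_reasoning` is hypothetical, and `clear_thinking=None` stands in for "not defined" in Jinja.

```python
def kept_reasoning(messages, clear_thinking=None):
    """Map assistant message index -> reasoning text that would be rendered.

    A value of None means the turn's reasoning is dropped from the prompt.
    """
    # Index of the last user message, or -1 if there is none.
    last_user = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"),
        default=-1,
    )
    # Mirrors `clear_thinking is defined and not clear_thinking`.
    keep_all = clear_thinking is not None and not clear_thinking

    kept = {}
    for i, m in enumerate(messages):
        if m["role"] != "assistant":
            continue
        reasoning = m.get("reasoning_content")
        keep = keep_all or i > last_user
        kept[i] = reasoning if (keep and isinstance(reasoning, str)) else None
    return kept


if __name__ == "__main__":
    chat = [
        {"role": "user", "content": "Q1"},
        {"role": "assistant", "content": "A1", "reasoning_content": "r1"},
        {"role": "user", "content": "Q2"},
        {"role": "assistant", "content": "A2", "reasoning_content": "r2"},
    ]
    print(kept_reasoning(chat))
    print(kept_reasoning(chat, clear_thinking=False))
```

With the default settings only the turn after the final user message keeps its reasoning, which keeps long multi-turn prompts shorter; passing `clear_thinking=False` preserves reasoning for every turn.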
diff --git a/config.json b/config.json
new file mode 100644
index 0000000000000000000000000000000000000000..e5ac5ceb7bc918aff3fc517f44fea80e9b5914dc
--- /dev/null
+++ b/config.json
@@ -0,0 +1,862 @@
+{
+ "architectures": [
+ "GlmMoeDsaForCausalLM"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "torch_dtype": "bfloat16",
+ "eos_token_id": [
+ 154820,
+ 154827,
+ 154829
+ ],
+ "ep_size": 1,
+ "first_k_dense_replace": 3,
+ "hidden_act": "silu",
+ "hidden_size": 6144,
+ "index_head_dim": 128,
+ "index_n_heads": 32,
+ "index_topk": 2048,
+ "indexer_rope_interleave": true,
+ "initializer_range": 0.02,
+ "intermediate_size": 12288,
+ "kv_lora_rank": 512,
+ "max_position_embeddings": 202752,
+ "mlp_layer_types": [
+ "dense",
+ "dense",
+ "dense",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse",
+ "sparse"
+ ],
+ "model_type": "glm_moe_dsa",
+ "moe_intermediate_size": 2048,
+ "moe_layer_freq": 1,
+ "n_group": 1,
+ "n_routed_experts": 256,
+ "n_shared_experts": 1,
+ "norm_topk_prob": true,
+ "num_attention_heads": 64,
+ "num_experts_per_tok": 8,
+ "num_hidden_layers": 78,
+ "num_key_value_heads": 64,
+ "num_nextn_predict_layers": 1,
+ "pad_token_id": 154821,
+ "pretraining_tp": 1,
+ "q_lora_rank": 2048,
+ "qk_head_dim": 256,
+ "qk_nope_head_dim": 192,
+ "qk_rope_head_dim": 64,
+ "quantization_config": {
+ "activation_scheme": "dynamic",
+ "fmt": "e4m3",
+ "modules_to_not_convert": [
+ "lm_head",
+ "model.embed_tokens",
+ "model.layers.0.input_layernorm",
+ "model.layers.0.post_attention_layernorm",
+ "model.layers.0.self_attn.indexer.k_norm",
+ "model.layers.0.self_attn.indexer.k_norm.bias",
+ "model.layers.0.self_attn.indexers_proj",
+ "model.layers.0.self_attn.kv_a_layernorm",
+ "model.layers.0.self_attn.q_a_layernorm",
+ "model.layers.1.input_layernorm",
+ "model.layers.1.post_attention_layernorm",
+ "model.layers.1.self_attn.indexer.k_norm",
+ "model.layers.1.self_attn.indexer.k_norm.bias",
+ "model.layers.1.self_attn.indexers_proj",
+ "model.layers.1.self_attn.kv_a_layernorm",
+ "model.layers.1.self_attn.q_a_layernorm",
+ "model.layers.2.input_layernorm",
+ "model.layers.2.post_attention_layernorm",
+ "model.layers.2.self_attn.indexer.k_norm",
+ "model.layers.2.self_attn.indexer.k_norm.bias",
+ "model.layers.2.self_attn.indexers_proj",
+ "model.layers.2.self_attn.kv_a_layernorm",
+ "model.layers.2.self_attn.q_a_layernorm",
+ "model.layers.3.input_layernorm",
+ "model.layers.3.mlp.gate",
+ "model.layers.3.mlp.gate.e_score_correction_bias",
+ "model.layers.3.post_attention_layernorm",
+ "model.layers.3.self_attn.indexer.k_norm",
+ "model.layers.3.self_attn.indexer.k_norm.bias",
+ "model.layers.3.self_attn.indexers_proj",
+ "model.layers.3.self_attn.kv_a_layernorm",
+ "model.layers.3.self_attn.q_a_layernorm",
+ "model.layers.4.input_layernorm",
+ "model.layers.4.mlp.gate",
+ "model.layers.4.mlp.gate.e_score_correction_bias",
+ "model.layers.4.post_attention_layernorm",
+ "model.layers.4.self_attn.indexer.k_norm",
+ "model.layers.4.self_attn.indexer.k_norm.bias",
+ "model.layers.4.self_attn.indexers_proj",
+ "model.layers.4.self_attn.kv_a_layernorm",
+ "model.layers.4.self_attn.q_a_layernorm",
+ "model.layers.5.input_layernorm",
+ "model.layers.5.mlp.gate",
+ "model.layers.5.mlp.gate.e_score_correction_bias",
+ "model.layers.5.post_attention_layernorm",
+ "model.layers.5.self_attn.indexer.k_norm",
+ "model.layers.5.self_attn.indexer.k_norm.bias",
+ "model.layers.5.self_attn.indexers_proj",
+ "model.layers.5.self_attn.kv_a_layernorm",
+ "model.layers.5.self_attn.q_a_layernorm",
+ "model.layers.6.input_layernorm",
+ "model.layers.6.mlp.gate",
+ "model.layers.6.mlp.gate.e_score_correction_bias",
+ "model.layers.6.post_attention_layernorm",
+ "model.layers.6.self_attn.indexer.k_norm",
+ "model.layers.6.self_attn.indexer.k_norm.bias",
+ "model.layers.6.self_attn.indexers_proj",
+ "model.layers.6.self_attn.kv_a_layernorm",
+ "model.layers.6.self_attn.q_a_layernorm",
+ "model.layers.7.input_layernorm",
+ "model.layers.7.mlp.gate",
+ "model.layers.7.mlp.gate.e_score_correction_bias",
+ "model.layers.7.post_attention_layernorm",
+ "model.layers.7.self_attn.indexer.k_norm",
+ "model.layers.7.self_attn.indexer.k_norm.bias",
+ "model.layers.7.self_attn.indexers_proj",
+ "model.layers.7.self_attn.kv_a_layernorm",
+ "model.layers.7.self_attn.q_a_layernorm",
+ "model.layers.8.input_layernorm",
+ "model.layers.8.mlp.gate",
+ "model.layers.8.mlp.gate.e_score_correction_bias",
+ "model.layers.8.post_attention_layernorm",
+ "model.layers.8.self_attn.indexer.k_norm",
+ "model.layers.8.self_attn.indexer.k_norm.bias",
+ "model.layers.8.self_attn.indexers_proj",
+ "model.layers.8.self_attn.kv_a_layernorm",
+ "model.layers.8.self_attn.q_a_layernorm",
+ "model.layers.9.input_layernorm",
+ "model.layers.9.mlp.gate",
+ "model.layers.9.mlp.gate.e_score_correction_bias",
+ "model.layers.9.post_attention_layernorm",
+ "model.layers.9.self_attn.indexer.k_norm",
+ "model.layers.9.self_attn.indexer.k_norm.bias",
+ "model.layers.9.self_attn.indexers_proj",
+ "model.layers.9.self_attn.kv_a_layernorm",
+ "model.layers.9.self_attn.q_a_layernorm",
+ "model.layers.10.input_layernorm",
+ "model.layers.10.mlp.gate",
+ "model.layers.10.mlp.gate.e_score_correction_bias",
+ "model.layers.10.post_attention_layernorm",
+ "model.layers.10.self_attn.indexer.k_norm",
+ "model.layers.10.self_attn.indexer.k_norm.bias",
+ "model.layers.10.self_attn.indexers_proj",
+ "model.layers.10.self_attn.kv_a_layernorm",
+ "model.layers.10.self_attn.q_a_layernorm",
+ "model.layers.11.input_layernorm",
+ "model.layers.11.mlp.gate",
+ "model.layers.11.mlp.gate.e_score_correction_bias",
+ "model.layers.11.post_attention_layernorm",
+ "model.layers.11.self_attn.indexer.k_norm",
+ "model.layers.11.self_attn.indexer.k_norm.bias",
+ "model.layers.11.self_attn.indexers_proj",
+ "model.layers.11.self_attn.kv_a_layernorm",
+ "model.layers.11.self_attn.q_a_layernorm",
+ "model.layers.12.input_layernorm",
+ "model.layers.12.mlp.gate",
+ "model.layers.12.mlp.gate.e_score_correction_bias",
+ "model.layers.12.post_attention_layernorm",
+ "model.layers.12.self_attn.indexer.k_norm",
+ "model.layers.12.self_attn.indexer.k_norm.bias",
+ "model.layers.12.self_attn.indexers_proj",
+ "model.layers.12.self_attn.kv_a_layernorm",
+ "model.layers.12.self_attn.q_a_layernorm",
+ "model.layers.13.input_layernorm",
+ "model.layers.13.mlp.gate",
+ "model.layers.13.mlp.gate.e_score_correction_bias",
+ "model.layers.13.post_attention_layernorm",
+ "model.layers.13.self_attn.indexer.k_norm",
+ "model.layers.13.self_attn.indexer.k_norm.bias",
+ "model.layers.13.self_attn.indexers_proj",
+ "model.layers.13.self_attn.kv_a_layernorm",
+ "model.layers.13.self_attn.q_a_layernorm",
+ "model.layers.14.input_layernorm",
+ "model.layers.14.mlp.gate",
+ "model.layers.14.mlp.gate.e_score_correction_bias",
+ "model.layers.14.post_attention_layernorm",
+ "model.layers.14.self_attn.indexer.k_norm",
+ "model.layers.14.self_attn.indexer.k_norm.bias",
+ "model.layers.14.self_attn.indexers_proj",
+ "model.layers.14.self_attn.kv_a_layernorm",
+ "model.layers.14.self_attn.q_a_layernorm",
+ "model.layers.15.input_layernorm",
+ "model.layers.15.mlp.gate",
+ "model.layers.15.mlp.gate.e_score_correction_bias",
+ "model.layers.15.post_attention_layernorm",
+ "model.layers.15.self_attn.indexer.k_norm",
+ "model.layers.15.self_attn.indexer.k_norm.bias",
+ "model.layers.15.self_attn.indexers_proj",
+ "model.layers.15.self_attn.kv_a_layernorm",
+ "model.layers.15.self_attn.q_a_layernorm",
+ "model.layers.16.input_layernorm",
+ "model.layers.16.mlp.gate",
+ "model.layers.16.mlp.gate.e_score_correction_bias",
+ "model.layers.16.post_attention_layernorm",
+ "model.layers.16.self_attn.indexer.k_norm",
+ "model.layers.16.self_attn.indexer.k_norm.bias",
+ "model.layers.16.self_attn.indexers_proj",
+ "model.layers.16.self_attn.kv_a_layernorm",
+ "model.layers.16.self_attn.q_a_layernorm",
+ "model.layers.17.input_layernorm",
+ "model.layers.17.mlp.gate",
+ "model.layers.17.mlp.gate.e_score_correction_bias",
+ "model.layers.17.post_attention_layernorm",
+ "model.layers.17.self_attn.indexer.k_norm",
+ "model.layers.17.self_attn.indexer.k_norm.bias",
+ "model.layers.17.self_attn.indexers_proj",
+ "model.layers.17.self_attn.kv_a_layernorm",
+ "model.layers.17.self_attn.q_a_layernorm",
+ "model.layers.18.input_layernorm",
+ "model.layers.18.mlp.gate",
+ "model.layers.18.mlp.gate.e_score_correction_bias",
+ "model.layers.18.post_attention_layernorm",
+ "model.layers.18.self_attn.indexer.k_norm",
+ "model.layers.18.self_attn.indexer.k_norm.bias",
+ "model.layers.18.self_attn.indexers_proj",
+ "model.layers.18.self_attn.kv_a_layernorm",
+ "model.layers.18.self_attn.q_a_layernorm",
+ "model.layers.19.input_layernorm",
+ "model.layers.19.mlp.gate",
+ "model.layers.19.mlp.gate.e_score_correction_bias",
+ "model.layers.19.post_attention_layernorm",
+ "model.layers.19.self_attn.indexer.k_norm",
+ "model.layers.19.self_attn.indexer.k_norm.bias",
+ "model.layers.19.self_attn.indexers_proj",
+ "model.layers.19.self_attn.kv_a_layernorm",
+ "model.layers.19.self_attn.q_a_layernorm",
+ "model.layers.20.input_layernorm",
+ "model.layers.20.mlp.gate",
+ "model.layers.20.mlp.gate.e_score_correction_bias",
+ "model.layers.20.post_attention_layernorm",
+ "model.layers.20.self_attn.indexer.k_norm",
+ "model.layers.20.self_attn.indexer.k_norm.bias",
+ "model.layers.20.self_attn.indexers_proj",
+ "model.layers.20.self_attn.kv_a_layernorm",
+ "model.layers.20.self_attn.q_a_layernorm",
+ "model.layers.21.input_layernorm",
+ "model.layers.21.mlp.gate",
+ "model.layers.21.mlp.gate.e_score_correction_bias",
+ "model.layers.21.post_attention_layernorm",
+ "model.layers.21.self_attn.indexer.k_norm",
+ "model.layers.21.self_attn.indexer.k_norm.bias",
+ "model.layers.21.self_attn.indexers_proj",
+ "model.layers.21.self_attn.kv_a_layernorm",
+ "model.layers.21.self_attn.q_a_layernorm",
+ "model.layers.22.input_layernorm",
+ "model.layers.22.mlp.gate",
+ "model.layers.22.mlp.gate.e_score_correction_bias",
+ "model.layers.22.post_attention_layernorm",
+ "model.layers.22.self_attn.indexer.k_norm",
+ "model.layers.22.self_attn.indexer.k_norm.bias",
+ "model.layers.22.self_attn.indexers_proj",
+ "model.layers.22.self_attn.kv_a_layernorm",
+ "model.layers.22.self_attn.q_a_layernorm",
+ "model.layers.23.input_layernorm",
+ "model.layers.23.mlp.gate",
+ "model.layers.23.mlp.gate.e_score_correction_bias",
+ "model.layers.23.post_attention_layernorm",
+ "model.layers.23.self_attn.indexer.k_norm",
+ "model.layers.23.self_attn.indexer.k_norm.bias",
+ "model.layers.23.self_attn.indexers_proj",
+ "model.layers.23.self_attn.kv_a_layernorm",
+ "model.layers.23.self_attn.q_a_layernorm",
+ "model.layers.24.input_layernorm",
+ "model.layers.24.mlp.gate",
+ "model.layers.24.mlp.gate.e_score_correction_bias",
+ "model.layers.24.post_attention_layernorm",
+ "model.layers.24.self_attn.indexer.k_norm",
+ "model.layers.24.self_attn.indexer.k_norm.bias",
+ "model.layers.24.self_attn.indexers_proj",
+ "model.layers.24.self_attn.kv_a_layernorm",
+ "model.layers.24.self_attn.q_a_layernorm",
+ "model.layers.25.input_layernorm",
+ "model.layers.25.mlp.gate",
+ "model.layers.25.mlp.gate.e_score_correction_bias",
+ "model.layers.25.post_attention_layernorm",
+ "model.layers.25.self_attn.indexer.k_norm",
+ "model.layers.25.self_attn.indexer.k_norm.bias",
+ "model.layers.25.self_attn.indexers_proj",
+ "model.layers.25.self_attn.kv_a_layernorm",
+ "model.layers.25.self_attn.q_a_layernorm",
+ "model.layers.26.input_layernorm",
+ "model.layers.26.mlp.gate",
+ "model.layers.26.mlp.gate.e_score_correction_bias",
+ "model.layers.26.post_attention_layernorm",
+ "model.layers.26.self_attn.indexer.k_norm",
+ "model.layers.26.self_attn.indexer.k_norm.bias",
+ "model.layers.26.self_attn.indexers_proj",
+ "model.layers.26.self_attn.kv_a_layernorm",
+ "model.layers.26.self_attn.q_a_layernorm",
+ "model.layers.27.input_layernorm",
+ "model.layers.27.mlp.gate",
+ "model.layers.27.mlp.gate.e_score_correction_bias",
+ "model.layers.27.post_attention_layernorm",
+ "model.layers.27.self_attn.indexer.k_norm",
+ "model.layers.27.self_attn.indexer.k_norm.bias",
+ "model.layers.27.self_attn.indexers_proj",
+ "model.layers.27.self_attn.kv_a_layernorm",
+ "model.layers.27.self_attn.q_a_layernorm",
+ "model.layers.28.input_layernorm",
+ "model.layers.28.mlp.gate",
+ "model.layers.28.mlp.gate.e_score_correction_bias",
+ "model.layers.28.post_attention_layernorm",
+ "model.layers.28.self_attn.indexer.k_norm",
+ "model.layers.28.self_attn.indexer.k_norm.bias",
+ "model.layers.28.self_attn.indexers_proj",
+ "model.layers.28.self_attn.kv_a_layernorm",
+ "model.layers.28.self_attn.q_a_layernorm",
+ "model.layers.29.input_layernorm",
+ "model.layers.29.mlp.gate",
+ "model.layers.29.mlp.gate.e_score_correction_bias",
+ "model.layers.29.post_attention_layernorm",
+ "model.layers.29.self_attn.indexer.k_norm",
+ "model.layers.29.self_attn.indexer.k_norm.bias",
+ "model.layers.29.self_attn.indexers_proj",
+ "model.layers.29.self_attn.kv_a_layernorm",
+ "model.layers.29.self_attn.q_a_layernorm",
+ "model.layers.30.input_layernorm",
+ "model.layers.30.mlp.gate",
+ "model.layers.30.mlp.gate.e_score_correction_bias",
+ "model.layers.30.post_attention_layernorm",
+ "model.layers.30.self_attn.indexer.k_norm",
+ "model.layers.30.self_attn.indexer.k_norm.bias",
+ "model.layers.30.self_attn.indexers_proj",
+ "model.layers.30.self_attn.kv_a_layernorm",
+ "model.layers.30.self_attn.q_a_layernorm",
+ "model.layers.31.input_layernorm",
+ "model.layers.31.mlp.gate",
+ "model.layers.31.mlp.gate.e_score_correction_bias",
+ "model.layers.31.post_attention_layernorm",
+ "model.layers.31.self_attn.indexer.k_norm",
+ "model.layers.31.self_attn.indexer.k_norm.bias",
+ "model.layers.31.self_attn.indexers_proj",
+ "model.layers.31.self_attn.kv_a_layernorm",
+ "model.layers.31.self_attn.q_a_layernorm",
+ "model.layers.32.input_layernorm",
+ "model.layers.32.mlp.gate",
+ "model.layers.32.mlp.gate.e_score_correction_bias",
+ "model.layers.32.post_attention_layernorm",
+ "model.layers.32.self_attn.indexer.k_norm",
+ "model.layers.32.self_attn.indexer.k_norm.bias",
+ "model.layers.32.self_attn.indexers_proj",
+ "model.layers.32.self_attn.kv_a_layernorm",
+ "model.layers.32.self_attn.q_a_layernorm",
+ "model.layers.33.input_layernorm",
+ "model.layers.33.mlp.gate",
+ "model.layers.33.mlp.gate.e_score_correction_bias",
+ "model.layers.33.post_attention_layernorm",
+ "model.layers.33.self_attn.indexer.k_norm",
+ "model.layers.33.self_attn.indexer.k_norm.bias",
+ "model.layers.33.self_attn.indexers_proj",
+ "model.layers.33.self_attn.kv_a_layernorm",
+ "model.layers.33.self_attn.q_a_layernorm",
+ "model.layers.34.input_layernorm",
+ "model.layers.34.mlp.gate",
+ "model.layers.34.mlp.gate.e_score_correction_bias",
+ "model.layers.34.post_attention_layernorm",
+ "model.layers.34.self_attn.indexer.k_norm",
+ "model.layers.34.self_attn.indexer.k_norm.bias",
+ "model.layers.34.self_attn.indexers_proj",
+ "model.layers.34.self_attn.kv_a_layernorm",
+ "model.layers.34.self_attn.q_a_layernorm",
+ "model.layers.35.input_layernorm",
+ "model.layers.35.mlp.gate",
+ "model.layers.35.mlp.gate.e_score_correction_bias",
+ "model.layers.35.post_attention_layernorm",
+ "model.layers.35.self_attn.indexer.k_norm",
+ "model.layers.35.self_attn.indexer.k_norm.bias",
+ "model.layers.35.self_attn.indexers_proj",
+ "model.layers.35.self_attn.kv_a_layernorm",
+ "model.layers.35.self_attn.q_a_layernorm",
+ "model.layers.36.input_layernorm",
+ "model.layers.36.mlp.gate",
+ "model.layers.36.mlp.gate.e_score_correction_bias",
+ "model.layers.36.post_attention_layernorm",
+ "model.layers.36.self_attn.indexer.k_norm",
+ "model.layers.36.self_attn.indexer.k_norm.bias",
+ "model.layers.36.self_attn.indexers_proj",
+ "model.layers.36.self_attn.kv_a_layernorm",
+ "model.layers.36.self_attn.q_a_layernorm",
+ "model.layers.37.input_layernorm",
+ "model.layers.37.mlp.gate",
+ "model.layers.37.mlp.gate.e_score_correction_bias",
+ "model.layers.37.post_attention_layernorm",
+ "model.layers.37.self_attn.indexer.k_norm",
+ "model.layers.37.self_attn.indexer.k_norm.bias",
+ "model.layers.37.self_attn.indexers_proj",
+ "model.layers.37.self_attn.kv_a_layernorm",
+ "model.layers.37.self_attn.q_a_layernorm",
+ "model.layers.38.input_layernorm",
+ "model.layers.38.mlp.gate",
+ "model.layers.38.mlp.gate.e_score_correction_bias",
+ "model.layers.38.post_attention_layernorm",
+ "model.layers.38.self_attn.indexer.k_norm",
+ "model.layers.38.self_attn.indexer.k_norm.bias",
+ "model.layers.38.self_attn.indexers_proj",
+ "model.layers.38.self_attn.kv_a_layernorm",
+ "model.layers.38.self_attn.q_a_layernorm",
+ "model.layers.39.input_layernorm",
+ "model.layers.39.mlp.gate",
+ "model.layers.39.mlp.gate.e_score_correction_bias",
+ "model.layers.39.post_attention_layernorm",
+ "model.layers.39.self_attn.indexer.k_norm",
+ "model.layers.39.self_attn.indexer.k_norm.bias",
+ "model.layers.39.self_attn.indexers_proj",
+ "model.layers.39.self_attn.kv_a_layernorm",
+ "model.layers.39.self_attn.q_a_layernorm",
+ "model.layers.40.input_layernorm",
+ "model.layers.40.mlp.gate",
+ "model.layers.40.mlp.gate.e_score_correction_bias",
+ "model.layers.40.post_attention_layernorm",
+ "model.layers.40.self_attn.indexer.k_norm",
+ "model.layers.40.self_attn.indexer.k_norm.bias",
+ "model.layers.40.self_attn.indexers_proj",
+ "model.layers.40.self_attn.kv_a_layernorm",
+ "model.layers.40.self_attn.q_a_layernorm",
+ "model.layers.41.input_layernorm",
+ "model.layers.41.mlp.gate",
+ "model.layers.41.mlp.gate.e_score_correction_bias",
+ "model.layers.41.post_attention_layernorm",
+ "model.layers.41.self_attn.indexer.k_norm",
+ "model.layers.41.self_attn.indexer.k_norm.bias",
+ "model.layers.41.self_attn.indexers_proj",
+ "model.layers.41.self_attn.kv_a_layernorm",
+ "model.layers.41.self_attn.q_a_layernorm",
+ "model.layers.42.input_layernorm",
+ "model.layers.42.mlp.gate",
+ "model.layers.42.mlp.gate.e_score_correction_bias",
+ "model.layers.42.post_attention_layernorm",
+ "model.layers.42.self_attn.indexer.k_norm",
+ "model.layers.42.self_attn.indexer.k_norm.bias",
+ "model.layers.42.self_attn.indexers_proj",
+ "model.layers.42.self_attn.kv_a_layernorm",
+ "model.layers.42.self_attn.q_a_layernorm",
+ "model.layers.43.input_layernorm",
+ "model.layers.43.mlp.gate",
+ "model.layers.43.mlp.gate.e_score_correction_bias",
+ "model.layers.43.post_attention_layernorm",
+ "model.layers.43.self_attn.indexer.k_norm",
+ "model.layers.43.self_attn.indexer.k_norm.bias",
+ "model.layers.43.self_attn.indexers_proj",
+ "model.layers.43.self_attn.kv_a_layernorm",
+ "model.layers.43.self_attn.q_a_layernorm",
+ "model.layers.44.input_layernorm",
+ "model.layers.44.mlp.gate",
+ "model.layers.44.mlp.gate.e_score_correction_bias",
+ "model.layers.44.post_attention_layernorm",
+ "model.layers.44.self_attn.indexer.k_norm",
+ "model.layers.44.self_attn.indexer.k_norm.bias",
+ "model.layers.44.self_attn.indexers_proj",
+ "model.layers.44.self_attn.kv_a_layernorm",
+ "model.layers.44.self_attn.q_a_layernorm",
+ "model.layers.45.input_layernorm",
+ "model.layers.45.mlp.gate",
+ "model.layers.45.mlp.gate.e_score_correction_bias",
+ "model.layers.45.post_attention_layernorm",
+ "model.layers.45.self_attn.indexer.k_norm",
+ "model.layers.45.self_attn.indexer.k_norm.bias",
+ "model.layers.45.self_attn.indexers_proj",
+ "model.layers.45.self_attn.kv_a_layernorm",
+ "model.layers.45.self_attn.q_a_layernorm",
+ "model.layers.46.input_layernorm",
+ "model.layers.46.mlp.gate",
+ "model.layers.46.mlp.gate.e_score_correction_bias",
+ "model.layers.46.post_attention_layernorm",
+ "model.layers.46.self_attn.indexer.k_norm",
+ "model.layers.46.self_attn.indexer.k_norm.bias",
+ "model.layers.46.self_attn.indexers_proj",
+ "model.layers.46.self_attn.kv_a_layernorm",
+ "model.layers.46.self_attn.q_a_layernorm",
+ "model.layers.47.input_layernorm",
+ "model.layers.47.mlp.gate",
+ "model.layers.47.mlp.gate.e_score_correction_bias",
+ "model.layers.47.post_attention_layernorm",
+ "model.layers.47.self_attn.indexer.k_norm",
+ "model.layers.47.self_attn.indexer.k_norm.bias",
+ "model.layers.47.self_attn.indexers_proj",
+ "model.layers.47.self_attn.kv_a_layernorm",
+ "model.layers.47.self_attn.q_a_layernorm",
+ "model.layers.48.input_layernorm",
+ "model.layers.48.mlp.gate",
+ "model.layers.48.mlp.gate.e_score_correction_bias",
+ "model.layers.48.post_attention_layernorm",
+ "model.layers.48.self_attn.indexer.k_norm",
+ "model.layers.48.self_attn.indexer.k_norm.bias",
+ "model.layers.48.self_attn.indexers_proj",
+ "model.layers.48.self_attn.kv_a_layernorm",
+ "model.layers.48.self_attn.q_a_layernorm",
+ "model.layers.49.input_layernorm",
+ "model.layers.49.mlp.gate",
+ "model.layers.49.mlp.gate.e_score_correction_bias",
+ "model.layers.49.post_attention_layernorm",
+ "model.layers.49.self_attn.indexer.k_norm",
+ "model.layers.49.self_attn.indexer.k_norm.bias",
+ "model.layers.49.self_attn.indexers_proj",
+ "model.layers.49.self_attn.kv_a_layernorm",
+ "model.layers.49.self_attn.q_a_layernorm",
+ "model.layers.50.input_layernorm",
+ "model.layers.50.mlp.gate",
+ "model.layers.50.mlp.gate.e_score_correction_bias",
+ "model.layers.50.post_attention_layernorm",
+ "model.layers.50.self_attn.indexer.k_norm",
+ "model.layers.50.self_attn.indexer.k_norm.bias",
+ "model.layers.50.self_attn.indexers_proj",
+ "model.layers.50.self_attn.kv_a_layernorm",
+ "model.layers.50.self_attn.q_a_layernorm",
+ "model.layers.51.input_layernorm",
+ "model.layers.51.mlp.gate",
+ "model.layers.51.mlp.gate.e_score_correction_bias",
+ "model.layers.51.post_attention_layernorm",
+ "model.layers.51.self_attn.indexer.k_norm",
+ "model.layers.51.self_attn.indexer.k_norm.bias",
+ "model.layers.51.self_attn.indexers_proj",
+ "model.layers.51.self_attn.kv_a_layernorm",
+ "model.layers.51.self_attn.q_a_layernorm",
+ "model.layers.52.input_layernorm",
+ "model.layers.52.mlp.gate",
+ "model.layers.52.mlp.gate.e_score_correction_bias",
+ "model.layers.52.post_attention_layernorm",
+ "model.layers.52.self_attn.indexer.k_norm",
+ "model.layers.52.self_attn.indexer.k_norm.bias",
+ "model.layers.52.self_attn.indexers_proj",
+ "model.layers.52.self_attn.kv_a_layernorm",
+ "model.layers.52.self_attn.q_a_layernorm",
+ "model.layers.53.input_layernorm",
+ "model.layers.53.mlp.gate",
+ "model.layers.53.mlp.gate.e_score_correction_bias",
+ "model.layers.53.post_attention_layernorm",
+ "model.layers.53.self_attn.indexer.k_norm",
+ "model.layers.53.self_attn.indexer.k_norm.bias",
+ "model.layers.53.self_attn.indexers_proj",
+ "model.layers.53.self_attn.kv_a_layernorm",
+ "model.layers.53.self_attn.q_a_layernorm",
+ "model.layers.54.input_layernorm",
+ "model.layers.54.mlp.gate",
+ "model.layers.54.mlp.gate.e_score_correction_bias",
+ "model.layers.54.post_attention_layernorm",
+ "model.layers.54.self_attn.indexer.k_norm",
+ "model.layers.54.self_attn.indexer.k_norm.bias",
+ "model.layers.54.self_attn.indexers_proj",
+ "model.layers.54.self_attn.kv_a_layernorm",
+ "model.layers.54.self_attn.q_a_layernorm",
+ "model.layers.55.input_layernorm",
+ "model.layers.55.mlp.gate",
+ "model.layers.55.mlp.gate.e_score_correction_bias",
+ "model.layers.55.post_attention_layernorm",
+ "model.layers.55.self_attn.indexer.k_norm",
+ "model.layers.55.self_attn.indexer.k_norm.bias",
+ "model.layers.55.self_attn.indexers_proj",
+ "model.layers.55.self_attn.kv_a_layernorm",
+ "model.layers.55.self_attn.q_a_layernorm",
+ "model.layers.56.input_layernorm",
+ "model.layers.56.mlp.gate",
+ "model.layers.56.mlp.gate.e_score_correction_bias",
+ "model.layers.56.post_attention_layernorm",
+ "model.layers.56.self_attn.indexer.k_norm",
+ "model.layers.56.self_attn.indexer.k_norm.bias",
+ "model.layers.56.self_attn.indexers_proj",
+ "model.layers.56.self_attn.kv_a_layernorm",
+ "model.layers.56.self_attn.q_a_layernorm",
+ "model.layers.57.input_layernorm",
+ "model.layers.57.mlp.gate",
+ "model.layers.57.mlp.gate.e_score_correction_bias",
+ "model.layers.57.post_attention_layernorm",
+ "model.layers.57.self_attn.indexer.k_norm",
+ "model.layers.57.self_attn.indexer.k_norm.bias",
+ "model.layers.57.self_attn.indexers_proj",
+ "model.layers.57.self_attn.kv_a_layernorm",
+ "model.layers.57.self_attn.q_a_layernorm",
+ "model.layers.58.input_layernorm",
+ "model.layers.58.mlp.gate",
+ "model.layers.58.mlp.gate.e_score_correction_bias",
+ "model.layers.58.post_attention_layernorm",
+ "model.layers.58.self_attn.indexer.k_norm",
+ "model.layers.58.self_attn.indexer.k_norm.bias",
+ "model.layers.58.self_attn.indexers_proj",
+ "model.layers.58.self_attn.kv_a_layernorm",
+ "model.layers.58.self_attn.q_a_layernorm",
+ "model.layers.59.input_layernorm",
+ "model.layers.59.mlp.gate",
+ "model.layers.59.mlp.gate.e_score_correction_bias",
+ "model.layers.59.post_attention_layernorm",
+ "model.layers.59.self_attn.indexer.k_norm",
+ "model.layers.59.self_attn.indexer.k_norm.bias",
+ "model.layers.59.self_attn.indexers_proj",
+ "model.layers.59.self_attn.kv_a_layernorm",
+ "model.layers.59.self_attn.q_a_layernorm",
+ "model.layers.60.input_layernorm",
+ "model.layers.60.mlp.gate",
+ "model.layers.60.mlp.gate.e_score_correction_bias",
+ "model.layers.60.post_attention_layernorm",
+ "model.layers.60.self_attn.indexer.k_norm",
+ "model.layers.60.self_attn.indexer.k_norm.bias",
+ "model.layers.60.self_attn.indexers_proj",
+ "model.layers.60.self_attn.kv_a_layernorm",
+ "model.layers.60.self_attn.q_a_layernorm",
+ "model.layers.61.input_layernorm",
+ "model.layers.61.mlp.gate",
+ "model.layers.61.mlp.gate.e_score_correction_bias",
+ "model.layers.61.post_attention_layernorm",
+ "model.layers.61.self_attn.indexer.k_norm",
+ "model.layers.61.self_attn.indexer.k_norm.bias",
+ "model.layers.61.self_attn.indexers_proj",
+ "model.layers.61.self_attn.kv_a_layernorm",
+ "model.layers.61.self_attn.q_a_layernorm",
+ "model.layers.62.input_layernorm",
+ "model.layers.62.mlp.gate",
+ "model.layers.62.mlp.gate.e_score_correction_bias",
+ "model.layers.62.post_attention_layernorm",
+ "model.layers.62.self_attn.indexer.k_norm",
+ "model.layers.62.self_attn.indexer.k_norm.bias",
+ "model.layers.62.self_attn.indexers_proj",
+ "model.layers.62.self_attn.kv_a_layernorm",
+ "model.layers.62.self_attn.q_a_layernorm",
+ "model.layers.63.input_layernorm",
+ "model.layers.63.mlp.gate",
+ "model.layers.63.mlp.gate.e_score_correction_bias",
+ "model.layers.63.post_attention_layernorm",
+ "model.layers.63.self_attn.indexer.k_norm",
+ "model.layers.63.self_attn.indexer.k_norm.bias",
+ "model.layers.63.self_attn.indexers_proj",
+ "model.layers.63.self_attn.kv_a_layernorm",
+ "model.layers.63.self_attn.q_a_layernorm",
+ "model.layers.64.input_layernorm",
+ "model.layers.64.mlp.gate",
+ "model.layers.64.mlp.gate.e_score_correction_bias",
+ "model.layers.64.post_attention_layernorm",
+ "model.layers.64.self_attn.indexer.k_norm",
+ "model.layers.64.self_attn.indexer.k_norm.bias",
+ "model.layers.64.self_attn.indexers_proj",
+ "model.layers.64.self_attn.kv_a_layernorm",
+ "model.layers.64.self_attn.q_a_layernorm",
+ "model.layers.65.input_layernorm",
+ "model.layers.65.mlp.gate",
+ "model.layers.65.mlp.gate.e_score_correction_bias",
+ "model.layers.65.post_attention_layernorm",
+ "model.layers.65.self_attn.indexer.k_norm",
+ "model.layers.65.self_attn.indexer.k_norm.bias",
+ "model.layers.65.self_attn.indexers_proj",
+ "model.layers.65.self_attn.kv_a_layernorm",
+ "model.layers.65.self_attn.q_a_layernorm",
+ "model.layers.66.input_layernorm",
+ "model.layers.66.mlp.gate",
+ "model.layers.66.mlp.gate.e_score_correction_bias",
+ "model.layers.66.post_attention_layernorm",
+ "model.layers.66.self_attn.indexer.k_norm",
+ "model.layers.66.self_attn.indexer.k_norm.bias",
+ "model.layers.66.self_attn.indexers_proj",
+ "model.layers.66.self_attn.kv_a_layernorm",
+ "model.layers.66.self_attn.q_a_layernorm",
+ "model.layers.67.input_layernorm",
+ "model.layers.67.mlp.gate",
+ "model.layers.67.mlp.gate.e_score_correction_bias",
+ "model.layers.67.post_attention_layernorm",
+ "model.layers.67.self_attn.indexer.k_norm",
+ "model.layers.67.self_attn.indexer.k_norm.bias",
+ "model.layers.67.self_attn.indexers_proj",
+ "model.layers.67.self_attn.kv_a_layernorm",
+ "model.layers.67.self_attn.q_a_layernorm",
+ "model.layers.68.input_layernorm",
+ "model.layers.68.mlp.gate",
+ "model.layers.68.mlp.gate.e_score_correction_bias",
+ "model.layers.68.post_attention_layernorm",
+ "model.layers.68.self_attn.indexer.k_norm",
+ "model.layers.68.self_attn.indexer.k_norm.bias",
+ "model.layers.68.self_attn.indexers_proj",
+ "model.layers.68.self_attn.kv_a_layernorm",
+ "model.layers.68.self_attn.q_a_layernorm",
+ "model.layers.69.input_layernorm",
+ "model.layers.69.mlp.gate",
+ "model.layers.69.mlp.gate.e_score_correction_bias",
+ "model.layers.69.post_attention_layernorm",
+ "model.layers.69.self_attn.indexer.k_norm",
+ "model.layers.69.self_attn.indexer.k_norm.bias",
+ "model.layers.69.self_attn.indexers_proj",
+ "model.layers.69.self_attn.kv_a_layernorm",
+ "model.layers.69.self_attn.q_a_layernorm",
+ "model.layers.70.input_layernorm",
+ "model.layers.70.mlp.gate",
+ "model.layers.70.mlp.gate.e_score_correction_bias",
+ "model.layers.70.post_attention_layernorm",
+ "model.layers.70.self_attn.indexer.k_norm",
+ "model.layers.70.self_attn.indexer.k_norm.bias",
+ "model.layers.70.self_attn.indexers_proj",
+ "model.layers.70.self_attn.kv_a_layernorm",
+ "model.layers.70.self_attn.q_a_layernorm",
+ "model.layers.71.input_layernorm",
+ "model.layers.71.mlp.gate",
+ "model.layers.71.mlp.gate.e_score_correction_bias",
+ "model.layers.71.post_attention_layernorm",
+ "model.layers.71.self_attn.indexer.k_norm",
+ "model.layers.71.self_attn.indexer.k_norm.bias",
+ "model.layers.71.self_attn.indexers_proj",
+ "model.layers.71.self_attn.kv_a_layernorm",
+ "model.layers.71.self_attn.q_a_layernorm",
+ "model.layers.72.input_layernorm",
+ "model.layers.72.mlp.gate",
+ "model.layers.72.mlp.gate.e_score_correction_bias",
+ "model.layers.72.post_attention_layernorm",
+ "model.layers.72.self_attn.indexer.k_norm",
+ "model.layers.72.self_attn.indexer.k_norm.bias",
+ "model.layers.72.self_attn.indexers_proj",
+ "model.layers.72.self_attn.kv_a_layernorm",
+ "model.layers.72.self_attn.q_a_layernorm",
+ "model.layers.73.input_layernorm",
+ "model.layers.73.mlp.gate",
+ "model.layers.73.mlp.gate.e_score_correction_bias",
+ "model.layers.73.post_attention_layernorm",
+ "model.layers.73.self_attn.indexer.k_norm",
+ "model.layers.73.self_attn.indexer.k_norm.bias",
+ "model.layers.73.self_attn.indexers_proj",
+ "model.layers.73.self_attn.kv_a_layernorm",
+ "model.layers.73.self_attn.q_a_layernorm",
+ "model.layers.74.input_layernorm",
+ "model.layers.74.mlp.gate",
+ "model.layers.74.mlp.gate.e_score_correction_bias",
+ "model.layers.74.post_attention_layernorm",
+ "model.layers.74.self_attn.indexer.k_norm",
+ "model.layers.74.self_attn.indexer.k_norm.bias",
+ "model.layers.74.self_attn.indexers_proj",
+ "model.layers.74.self_attn.kv_a_layernorm",
+ "model.layers.74.self_attn.q_a_layernorm",
+ "model.layers.75.input_layernorm",
+ "model.layers.75.mlp.gate",
+ "model.layers.75.mlp.gate.e_score_correction_bias",
+ "model.layers.75.post_attention_layernorm",
+ "model.layers.75.self_attn.indexer.k_norm",
+ "model.layers.75.self_attn.indexer.k_norm.bias",
+ "model.layers.75.self_attn.indexers_proj",
+ "model.layers.75.self_attn.kv_a_layernorm",
+ "model.layers.75.self_attn.q_a_layernorm",
+ "model.layers.76.input_layernorm",
+ "model.layers.76.mlp.gate",
+ "model.layers.76.mlp.gate.e_score_correction_bias",
+ "model.layers.76.post_attention_layernorm",
+ "model.layers.76.self_attn.indexer.k_norm",
+ "model.layers.76.self_attn.indexer.k_norm.bias",
+ "model.layers.76.self_attn.indexers_proj",
+ "model.layers.76.self_attn.kv_a_layernorm",
+ "model.layers.76.self_attn.q_a_layernorm",
+ "model.layers.77.input_layernorm",
+ "model.layers.77.mlp.gate",
+ "model.layers.77.mlp.gate.e_score_correction_bias",
+ "model.layers.77.post_attention_layernorm",
+ "model.layers.77.self_attn.indexer.k_norm",
+ "model.layers.77.self_attn.indexer.k_norm.bias",
+ "model.layers.77.self_attn.indexers_proj",
+ "model.layers.77.self_attn.kv_a_layernorm",
+ "model.layers.77.self_attn.q_a_layernorm",
+ "model.layers.78.eh_proj",
+ "model.layers.78.enorm",
+ "model.layers.78.hnorm",
+ "model.layers.78.input_layernorm",
+ "model.layers.78.mlp.gate",
+ "model.layers.78.mlp.gate.e_score_correction_bias",
+ "model.layers.78.post_attention_layernorm",
+ "model.layers.78.self_attn.indexer.k_norm",
+ "model.layers.78.self_attn.indexer.k_norm.bias",
+ "model.layers.78.self_attn.indexers_proj",
+ "model.layers.78.self_attn.kv_a_layernorm",
+ "model.layers.78.self_attn.q_a_layernorm",
+ "model.layers.78.shared_head.norm",
+ "model.norm"
+ ],
+ "quant_method": "fp8",
+ "weight_block_size": [
+ 128,
+ 128
+ ]
+ },
+ "rms_norm_eps": 1e-05,
+ "rope_interleave": true,
+ "rope_parameters": {
+ "rope_theta": 1000000,
+ "rope_type": "default"
+ },
+ "routed_scaling_factor": 2.5,
+ "scoring_func": "sigmoid",
+ "tie_word_embeddings": false,
+ "topk_group": 1,
+ "topk_method": "noaux_tc",
+ "transformers_version": "5.6.0.dev0",
+ "unsloth_fixed": true,
+ "use_cache": true,
+ "v_head_dim": 256,
+ "vocab_size": 154880
+}
\ No newline at end of file
diff --git a/generation_config.json b/generation_config.json
new file mode 100644
index 0000000000000000000000000000000000000000..453800a061bdc65b75b9dd99ecc66ede543dac89
--- /dev/null
+++ b/generation_config.json
@@ -0,0 +1,12 @@
+{
+ "_from_model_config": true,
+ "eos_token_id": [
+ 154820,
+ 154827,
+ 154829
+ ],
+ "pad_token_id": 154820,
+ "temperature": 1.0,
+ "top_p": 0.95,
+ "transformers_version": "5.4.0"
+}
diff --git a/model-00001-of-00142.safetensors b/model-00001-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b7e90e0a4f6e701b4ff07944034d726196ccdc01
--- /dev/null
+++ b/model-00001-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:205976fdd1cddd00be3bc1e20755a163054e225af995a72f6e4149c018204be4
+size 5363940952
diff --git a/model-00002-of-00142.safetensors b/model-00002-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..7b2d2ae9340471957f051f89894ca1206a5aacd4
--- /dev/null
+++ b/model-00002-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d909a059ea5207282552fe5f5e4a4d411da68cf4644057ba1cc7a329ebfb2b9a
+size 5361736696
diff --git a/model-00003-of-00142.safetensors b/model-00003-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c741fd78285d7bbdd74be6602a366c92a6771035
--- /dev/null
+++ b/model-00003-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e1c708c55c71b2f537917ab7144ca1334f365724869662713439fe16e910582b
+size 5363339120
diff --git a/model-00004-of-00142.safetensors b/model-00004-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8537ddabac41a3795da0debcff159075e95e10b8
--- /dev/null
+++ b/model-00004-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c8405236f949cd74c17af1d3ba65818f731e3dce4112a3760bb55ab808d86cae
+size 5361736640
diff --git a/model-00005-of-00142.safetensors b/model-00005-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d02f375bae0c75707244b0612ab5a0623180c30a
--- /dev/null
+++ b/model-00005-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9f76ae761b928c7e905748b3a5cb914528784ea0c6d4e8d84d3ebbd384998f62
+size 5363339176
diff --git a/model-00006-of-00142.safetensors b/model-00006-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6c17f81439a80aeb5e6595e6253857be14c989b7
--- /dev/null
+++ b/model-00006-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eec9fcc3a36b3cd7df973960072f42b1e2fa93094cd9e798ea8b857ff9ba8bae
+size 5361736504
diff --git a/model-00007-of-00142.safetensors b/model-00007-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c212cf0c289df3bc65460d46dd78d85c5122cf1a
--- /dev/null
+++ b/model-00007-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80205be2a99e07bc102228bf4b36b64d0ba1c175a75bd2905a83bbdd59bda06c
+size 5363339304
diff --git a/model-00008-of-00142.safetensors b/model-00008-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..112301ca85b876f0dfea40dc44f762a0f10d37d2
--- /dev/null
+++ b/model-00008-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c3778a0c3bcaca60f56c44fbc4a87f60bfab72323bdf8f2cabd255ec7eb24ff7
+size 5361736368
diff --git a/model-00009-of-00142.safetensors b/model-00009-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6dbf362f70d0234045edcfb811bd4a29a6412f55
--- /dev/null
+++ b/model-00009-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e2a4ad3a27c1863f32611e297b7ad85ddf229237d3aef3abff5895be59ef2ef4
+size 5363339440
diff --git a/model-00010-of-00142.safetensors b/model-00010-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..f4baa748b2e187d7dfd8dc0dc082779a9d569f50
--- /dev/null
+++ b/model-00010-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:46449f548211661cb5c3845f16501b4250b9fae313914f9b83b31a365ed498d6
+size 5361736232
diff --git a/model-00011-of-00142.safetensors b/model-00011-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6ea5e3ba7b1597b4a1b5e20e802b13644f6e49d1
--- /dev/null
+++ b/model-00011-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:36557532935d5edf38aef425de1494f51039f61b34d2f0a39aabf285bbfa7752
+size 5363339592
diff --git a/model-00012-of-00142.safetensors b/model-00012-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..914d079dcfe658d046083070097a12cca842639e
--- /dev/null
+++ b/model-00012-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:36c6ff15d39f1355bbda9802a595ef91262495f1d8381d8634bc2d7468d82509
+size 5363339104
diff --git a/model-00013-of-00142.safetensors b/model-00013-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..919a56f0a1cd41097211ddad80e1e98bfd6afe9f
--- /dev/null
+++ b/model-00013-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1722fbe67d4bfa26442617837f908445a272e269b26219791e7509e71f3041fb
+size 5361736696
diff --git a/model-00014-of-00142.safetensors b/model-00014-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8a4baa29bd46f559cfba2b47c6db5daf31902a6c
--- /dev/null
+++ b/model-00014-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a4028b398d2e09e362dc360fa5fb536cafc80fa9326fd3a5f43a867cc937704
+size 5363339120
diff --git a/model-00015-of-00142.safetensors b/model-00015-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b5171cbe83f98732137f02873f2ae95252a8f71c
--- /dev/null
+++ b/model-00015-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:916305b18ec04fd399ec4f197bf29c7600df868f974327ead4280bd6c9004352
+size 5361736688
diff --git a/model-00016-of-00142.safetensors b/model-00016-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..675603373fdcfa217829fbb800c391aaa7e35d15
--- /dev/null
+++ b/model-00016-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00444ce81f597e639561a3fa0126cdeaefc8122ae25bd14cce949b260861e554
+size 5363339128
diff --git a/model-00017-of-00142.safetensors b/model-00017-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..798563ea0ad56a217c1853ff24f3969961809b8f
--- /dev/null
+++ b/model-00017-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:345af0570dc88a2ae4cceae6cd99e52ad6d149b619f49ee18bfce0aaad037838
+size 5361736552
diff --git a/model-00018-of-00142.safetensors b/model-00018-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..319d97d022402e8ac01437df4c709a5852dc86d6
--- /dev/null
+++ b/model-00018-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:69187f3011e235f55a47143b7b9998836acca7550e8924047e58b0f35ab6d554
+size 5363339256
diff --git a/model-00019-of-00142.safetensors b/model-00019-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b3584578c385c922bdea5a6a7d3f9015ef04d698
--- /dev/null
+++ b/model-00019-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9fe4da41bf7287e53e77fa66ad7253b0189509b9fc964f00dffe77810c5fff19
+size 5361736416
diff --git a/model-00020-of-00142.safetensors b/model-00020-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..2c6181d3809a4563136e095e5e5de193c7eaa908
--- /dev/null
+++ b/model-00020-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ffbd31d3787cbb991c7026c263490206b84084f48f555fd1f5e5076ae75b1b5b
+size 5361791448
diff --git a/model-00021-of-00142.safetensors b/model-00021-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..cde8d8bf22464427d8f0f74d1488303b60c49928
--- /dev/null
+++ b/model-00021-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2da172eb3d7dcdb22196bef3e5d2f7997e27975007f32072c26e9b0cebdf3c27
+size 5361736352
diff --git a/model-00022-of-00142.safetensors b/model-00022-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..eaafa17318b0e6fb9004d88e8cf123442424a414
--- /dev/null
+++ b/model-00022-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6f8903dd5a119b301d23e37aa56b6c91b9c144b31d863a26af59177c9afab4dd
+size 5363339456
diff --git a/model-00023-of-00142.safetensors b/model-00023-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..749ce3ce2b38c314d0f2353beb5e0a4211d5fa19
--- /dev/null
+++ b/model-00023-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:66c41ca799608fa422b382f31ea56d43b97872de552b1dada68a4d5ed6be1d28
+size 5361736224
diff --git a/model-00024-of-00142.safetensors b/model-00024-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..fd9465cb6ee924062ead3ef23bc055953b04300c
--- /dev/null
+++ b/model-00024-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d8e543d1361c6504cf9f7cf937210cac1d7772c7f568bec5aac7f38e3e4aa30
+size 5363339608
diff --git a/model-00025-of-00142.safetensors b/model-00025-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c067a11b78492dd1aeebb307286830707c604dec
--- /dev/null
+++ b/model-00025-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4939a8045fc5619d72adf41d638107b2270845adc2862d8ec0cce349ded7c54c
+size 5363339104
diff --git a/model-00026-of-00142.safetensors b/model-00026-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d9ae22db3eb471906c0d6f18a7d3768c3bd9134a
--- /dev/null
+++ b/model-00026-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ece8ece94c8e90d459951fc1e5acadbc744fb6c2fb06ceafa3ac3b829db25ab
+size 5361736696
diff --git a/model-00027-of-00142.safetensors b/model-00027-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3213842d0e2be27fbf55c206566513cbcedfa84a
--- /dev/null
+++ b/model-00027-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad676bf742023e409af3722d70d097fc257991bdb0bd04bed8f17397b7def3c6
+size 5363339112
diff --git a/model-00028-of-00142.safetensors b/model-00028-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..fff6cfc49384ffaa6122d35cfa5e1d417d9428c1
--- /dev/null
+++ b/model-00028-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f37bad9f196213cd05081cf2d2a466f4dabaa22c594eff452933fcd5ab3dc64
+size 5361736664
diff --git a/model-00029-of-00142.safetensors b/model-00029-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9a8190eeeedafecb38b82dc4a6342d261fcebe00
--- /dev/null
+++ b/model-00029-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3b49905f4405d2e7a70ca1b4af0851350c86e685f982a0570c600d11dd802880
+size 5363339152
diff --git a/model-00030-of-00142.safetensors b/model-00030-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ff3ce9069759e853c25e1a478bf3a9762b5eae5a
--- /dev/null
+++ b/model-00030-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e7b654485f9c01c86621f6cb024e1f6bdd81300011b6b9b11275c3fa7982472d
+size 5361736528
diff --git a/model-00031-of-00142.safetensors b/model-00031-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..02184f70dcb978c70d0558989f2779a06f45da89
--- /dev/null
+++ b/model-00031-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34d78a779bd69810b5cb6467dcaa8705c6a88d8c2e48e8408881ffd2dcbd4a46
+size 5363339280
diff --git a/model-00032-of-00142.safetensors b/model-00032-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8242f4034b83c2fcd10ed0948f413b40587163be
--- /dev/null
+++ b/model-00032-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:995e8d62d59d3dc544864b1f3d64f43d698acb411bd9f001993437fe8ffc99bb
+size 5361736392
diff --git a/model-00033-of-00142.safetensors b/model-00033-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..565670e96ac215ad193129f074752d730e1323d7
--- /dev/null
+++ b/model-00033-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:98b8ed708b1c6c14283c378440bb687f71c2a4612ab35e910169a02e4678d6f7
+size 5363339416
diff --git a/model-00034-of-00142.safetensors b/model-00034-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e2d30be2b538118a79d7cfb9831b58c7005889f8
--- /dev/null
+++ b/model-00034-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c2d5253375958275caa3525b9a5a8bdca3a75cea15b98b36747fc4931777e7cc
+size 5361736264
diff --git a/model-00035-of-00142.safetensors b/model-00035-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ac175390b54ef2cd9c91a43099d8de65225ba3de
--- /dev/null
+++ b/model-00035-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:17a2d75e9d0598413de333185465867a1e1722441266a449c421e09a4b3bf813
+size 5363339544
diff --git a/model-00036-of-00142.safetensors b/model-00036-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9ee4fb34d3e0ee85279c095a5814e58fa7f15399
--- /dev/null
+++ b/model-00036-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:441ae5c24d32a8b674dd8fb1db1d4835ef46bd470c9d2f096d86b5031a3fcb41
+size 5363339104
diff --git a/model-00037-of-00142.safetensors b/model-00037-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8b3e1cb91e598bfca27f5515a5920e0dcc734d9a
--- /dev/null
+++ b/model-00037-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9bf844f6bee957fc203431c39b0a5eb9f104258c170e4e5490f1ea8487d593b6
+size 5361736696
diff --git a/model-00038-of-00142.safetensors b/model-00038-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3d3d35b71b0e149018abe027d03deab1c42a9246
--- /dev/null
+++ b/model-00038-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ba412f5ccc524070eb35272438a2f549bb7165cb4b7b86530955bf281fe8ea2
+size 5363338920
diff --git a/model-00039-of-00142.safetensors b/model-00039-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ac8757c7b5da629993d2607f7f1c01b50b2879e2
--- /dev/null
+++ b/model-00039-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ebc73ae8cfce5ebcaf759962748d66b5010beccacd1f77b02e45d300c36c2b8
+size 5361735840
diff --git a/model-00040-of-00142.safetensors b/model-00040-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ca5935b6e88caddfa99ebf3261f39b46d7343ec3
--- /dev/null
+++ b/model-00040-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:05e6033cf198187e3a89fc90f6c9364712056b73669ba853e442a4bff812e695
+size 5363338584
diff --git a/model-00041-of-00142.safetensors b/model-00041-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d76316c61c4573de4400c79dbe94d36f005b4443
--- /dev/null
+++ b/model-00041-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b96b36ed04a5750c16a09df81f01cc396ca29bd7b50605990fcfab083b1d70c9
+size 5361736576
diff --git a/model-00042-of-00142.safetensors b/model-00042-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..cc871c050419073237cab13ba3f0353c6d2618f2
--- /dev/null
+++ b/model-00042-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c32899afc42f70a40d48640f15991f55a691ab04e49af60844cf476fb9e65ca9
+size 5363339232
diff --git a/model-00043-of-00142.safetensors b/model-00043-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..19d5f9f3f374a334a3a7171371b5b309c1cce13f
--- /dev/null
+++ b/model-00043-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:91036a2dbf0f882d0ca1a39db1e8735d40ba5e689f3cc06e0afb859cee5b288c
+size 5361736440
diff --git a/model-00044-of-00142.safetensors b/model-00044-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..95a70457fe7b0980e04445915e025f08d4e92495
--- /dev/null
+++ b/model-00044-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7763ff9d8d8f85e0d9430a1688734744d5c5d4b46224f3b0cc555529f5d0a529
+size 5363339368
diff --git a/model-00045-of-00142.safetensors b/model-00045-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..44d758f89e7e499af530fd41e0d93b45ff42e151
--- /dev/null
+++ b/model-00045-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a3e1fb4ed18f8fd9b3a89ececacbdfaa2af3b0f61a4550ebb6a31ce600b4b834
+size 5361736312
diff --git a/model-00046-of-00142.safetensors b/model-00046-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8e67b9ddc0b7448cb83929a8a6efdf72ea493205
--- /dev/null
+++ b/model-00046-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8bcf2b2b8be75a1f7552bd0c164f6fd7973f8b5b77557cf657272b7080c4590b
+size 5363339504
diff --git a/model-00047-of-00142.safetensors b/model-00047-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e8266e38c1cd9753f42bdde1a72e856de3bd1177
--- /dev/null
+++ b/model-00047-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f464b9c18529a519c45f2c15c668b472b8609f4d7ca1979b92a8c08480f8d434
+size 5342350104
diff --git a/model-00048-of-00142.safetensors b/model-00048-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..147e368edfbceed7fcc102d83ddfd19f9908a9bf
--- /dev/null
+++ b/model-00048-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00e90eae66e53cddfd1552a5ddfd8bf79bf9e33b1914f67ba8871ba3f15831cd
+size 5357553232
diff --git a/model-00049-of-00142.safetensors b/model-00049-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..562e09d7776d9581acb2488910a40ee627f7a57b
--- /dev/null
+++ b/model-00049-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a25f7a4431fd3dcb43e7e8c9097112100db23319a2c1126f0003b9fcabc28a1b
+size 5363339104
diff --git a/model-00050-of-00142.safetensors b/model-00050-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..16ec787fe30235d41130871a2b70aebdd4fde1b2
--- /dev/null
+++ b/model-00050-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b79ea4649cecb0a980b77c0bac2a6e7232c60803ff64d880d35f4856d2ee9aee
+size 5361736696
diff --git a/model-00051-of-00142.safetensors b/model-00051-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6e25b499b9196fab120d4bd740161b6cb862cb6c
--- /dev/null
+++ b/model-00051-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26634691beece96dbe843e7017b012438f366ab3fed557bae907c1acadfe4f33
+size 5363339120
diff --git a/model-00052-of-00142.safetensors b/model-00052-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..f531dcf6ba8474ed3cb597bfae48d7cc14b582e7
--- /dev/null
+++ b/model-00052-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:66ed6313b28d80d1e6fcb3ca0fbd26a0a2a75f86e3160aca543802941449b04c
+size 5361736632
diff --git a/model-00053-of-00142.safetensors b/model-00053-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..2a91c0caddecf1c35a2eacfedd0fa692fe5dd247
--- /dev/null
+++ b/model-00053-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81c4edd555eb0f500955c8820fbcf01f03d080da1f747d4227db73a4ce637cea
+size 5363339176
diff --git a/model-00054-of-00142.safetensors b/model-00054-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e3862fcd39c39887b94482e63aca4ed7c85d0f16
--- /dev/null
+++ b/model-00054-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cc34d07e8b48f9ac2b761044915ce1c54c904b1cf8830d4f89c99c27743c0962
+size 5361736496
diff --git a/model-00055-of-00142.safetensors b/model-00055-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..dd7690536bce449c1594d66dabfb7ff079eb85b5
--- /dev/null
+++ b/model-00055-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e31f426911d19731e6f7052c58c7ef36506b509b689934d3a5a9366147949155
+size 5363339320
diff --git a/model-00056-of-00142.safetensors b/model-00056-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6838f4f7c063b71cbd79e7b2798cf60a74863814
--- /dev/null
+++ b/model-00056-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ae8b17b5669adaf23464dbce9871a9240c7e811d0142979952ef4d7d1bfeaf7b
+size 5361736360
diff --git a/model-00057-of-00142.safetensors b/model-00057-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..270b31265880e4b653f960eb9e5234ba5fe73598
--- /dev/null
+++ b/model-00057-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:12a3eda892b07bca6d281c2892c8e0afa9d3cd372bbadb7a32ea53c867c31f04
+size 5363339448
diff --git a/model-00058-of-00142.safetensors b/model-00058-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..aa56ad207add4a86470a73e30f885e7d34142738
--- /dev/null
+++ b/model-00058-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1dacb2505271a4f1429ccb04b4d220d2b574f1beda79c2c1544fd34788149318
+size 5361736224
diff --git a/model-00059-of-00142.safetensors b/model-00059-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d8e2739fdead7b28a727ee4611559f8751b24327
--- /dev/null
+++ b/model-00059-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3de72a77ea958ec1cb91f8a9e0adf0248fbb5646c390dc591c8cfeb6f08121d1
+size 5363338800
diff --git a/model-00060-of-00142.safetensors b/model-00060-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..fb115ec1d8d7b30163812d61f0d4dd14989c4dd0
--- /dev/null
+++ b/model-00060-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a0430b0f5b1b2381a8f6002ed5f16a0a8a46ffde2690fda4113e776d3d71b014
+size 5363338336
diff --git a/model-00061-of-00142.safetensors b/model-00061-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ce9ab5da4db7a2f562c8a34253624b0a3579a8c5
--- /dev/null
+++ b/model-00061-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b53485d2cb922ac2a4d4e9936c50683448739e2a5cb9b3cf9c57e60498a3bae9
+size 5361736696
diff --git a/model-00062-of-00142.safetensors b/model-00062-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..4f0f2bffc0074769bad8ce4306c7e300d7ab7711
--- /dev/null
+++ b/model-00062-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5432981d1b2516aa93c8506bb59721cdb3c6dbdb5095eb32804d06e90ffd485
+size 5363339112
diff --git a/model-00063-of-00142.safetensors b/model-00063-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3d63f569b9fe4013373693c14aba55c5bc70cb11
--- /dev/null
+++ b/model-00063-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:108c6e36bd436874a1be0d84657e5964231a53b2acf41b9f5885123e3d0f754a
+size 5361736680
diff --git a/model-00064-of-00142.safetensors b/model-00064-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..aee02dc281cead3224ac61fb29ccdb3f27387e07
--- /dev/null
+++ b/model-00064-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:335a18eb4faa9a2e4b0c0425d75322b47f7b9808ca48a9c03dce6bde87816942
+size 5363339128
diff --git a/model-00065-of-00142.safetensors b/model-00065-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..a903bd491090396b0561fb32cd1015412cb701f2
--- /dev/null
+++ b/model-00065-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:caa1fcc2ba6c7f7439777362590c65c048c4211dbb6f01b462d047b0fbe97591
+size 5361736552
diff --git a/model-00066-of-00142.safetensors b/model-00066-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..7a05fe622f0841d0d4b4c31272ba2c9484e2fa68
--- /dev/null
+++ b/model-00066-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e85116f997e723fc023800b3d600ad07873bb5ee23a50ee088a099d0e958aa83
+size 5363339264
diff --git a/model-00067-of-00142.safetensors b/model-00067-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..0c293b80737ca2cb30e7974d3f314ab3fae64625
--- /dev/null
+++ b/model-00067-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ed2d941d5c2dc603e4c80bae92aa8845a8b5b605526da829ed7ab48f7021142
+size 5361736416
diff --git a/model-00068-of-00142.safetensors b/model-00068-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b7df95ed9ac31e477277c6ac7fa5009365aac458
--- /dev/null
+++ b/model-00068-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e846e559c9a26f01f44ddfcfa8f433d55a637e530a8f298d95dd357220ec48d7
+size 5363339400
diff --git a/model-00069-of-00142.safetensors b/model-00069-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c1f1024344cf1ae2f7e6db0b398b5e984ac180e1
--- /dev/null
+++ b/model-00069-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:907b38c71f3345fa14fad3fc94b5a27b32d0fb77369c2bf916075b42730ad73d
+size 5361736280
diff --git a/model-00070-of-00142.safetensors b/model-00070-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..2ef2cc8407db040c6f9fd69432d4833d5977a350
--- /dev/null
+++ b/model-00070-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9fc0fa844bc418f51bf62b777a56f1f95b94b1b2d93308ff014e80aa591b7e42
+size 5363339528
diff --git a/model-00071-of-00142.safetensors b/model-00071-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9806c74c6b2fce004aba6619703f44f6d6dd34bf
--- /dev/null
+++ b/model-00071-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34ccdeb511e70cdfd729f4a8ebb55ef9c54a72d9e52c5bd16a92666aea921456
+size 5363339104
diff --git a/model-00072-of-00142.safetensors b/model-00072-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..039d0aec57e9f5c0bf2c4d4dfe369a656d60b90c
--- /dev/null
+++ b/model-00072-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6408c8c9a09d9dc4264e1984bc3add92281270c731f64d91a0d96bb6391e119e
+size 5361736696
diff --git a/model-00073-of-00142.safetensors b/model-00073-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..44a0bc550630ff9cb65970e300e8e243cdda17ca
--- /dev/null
+++ b/model-00073-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0c4eb8ed0dd56eebb81a0d44f6570e758e561a86cd29645024ee8c3c074690f
+size 5363339104
diff --git a/model-00074-of-00142.safetensors b/model-00074-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3d5413f3d14bdd436b6e5d844ffb680e952b150a
--- /dev/null
+++ b/model-00074-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:978db2195765b2259f71de6583d59f859091570bfaa8f548cf775e99396dd535
+size 5361736696
diff --git a/model-00075-of-00142.safetensors b/model-00075-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..04a579c0561db26244c8f77c567f0ff2a0264a0d
--- /dev/null
+++ b/model-00075-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ee63e9bb88509ab0089756e68a80795558a5a2fa2d19c631b66835955197d4da
+size 5363339112
diff --git a/model-00076-of-00142.safetensors b/model-00076-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..36851d0227e9cf3f56c5df7f54c7aedc0e88bb6f
--- /dev/null
+++ b/model-00076-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7df10590ad98767b7c412c550c0debf8943de696d2acc422553e55945a52dd7f
+size 5361736592
diff --git a/model-00077-of-00142.safetensors b/model-00077-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3478fbaf35951f5b8b9baa65d162d11ae29e50b6
--- /dev/null
+++ b/model-00077-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb120541f2f25a39fd8e4b592492a365658bad9d460dabbb41e55346b0542e47
+size 5363339224
diff --git a/model-00078-of-00142.safetensors b/model-00078-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d06c82eeba11d25518f72f50d387d2a3e829344d
--- /dev/null
+++ b/model-00078-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ed29e4169e21d333cae9f47dc66760a8030ec4d8274b8ad17b28fef03a001ea
+size 5361736456
diff --git a/model-00079-of-00142.safetensors b/model-00079-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..fb75da9ae96c946250e6584c3f9b5ed2dd291799
--- /dev/null
+++ b/model-00079-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:87e4dfa5e801ffeb773671f187a981b4e7454a47006f733181a446fc381803cb
+size 5363338784
diff --git a/model-00080-of-00142.safetensors b/model-00080-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..0bc3e9dd62c36e9e09ef1886a47f08a222e48d83
--- /dev/null
+++ b/model-00080-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a92f502b1234c8909dcd7d20ef558bda611a8989ab00f71a306a260767661a0b
+size 5361735472
diff --git a/model-00081-of-00142.safetensors b/model-00081-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9dff3580cc0c58d6c5266945bc6f6216253e897c
--- /dev/null
+++ b/model-00081-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9fafe2068fcc0ec959cb44ceb11f3e8f2caf00e76b1725ff18daeaf84e07ac70
+size 5363339344
diff --git a/model-00082-of-00142.safetensors b/model-00082-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..0e9753633bdf49ffaa82a7ac9a3661f80d98979b
--- /dev/null
+++ b/model-00082-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5a6351865a588d7d68852952637df7c1cd429ffd60979717dd91acb5ed8663d2
+size 5317175232
diff --git a/model-00083-of-00142.safetensors b/model-00083-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e54327f506689fa546f78dbc09b8b3d257a17fb8
--- /dev/null
+++ b/model-00083-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:981c0ae29f9c0b0c49488e689688028105e5c6a90e00b78e98d7fd82d077c1b8
+size 5357555688
diff --git a/model-00084-of-00142.safetensors b/model-00084-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..f7d322e1ebbb2b5738f09bb63fcb024e3323603d
--- /dev/null
+++ b/model-00084-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:da16d8b9f80d7c8bf7cf120509b90fc0ea38c932614e392a1ac2d4dddccff2ec
+size 5363339104
diff --git a/model-00085-of-00142.safetensors b/model-00085-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..1e640df17f03b397eab9892a7af8c08e3bef79b0
--- /dev/null
+++ b/model-00085-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d03bb770e771d2754f25d93e5d0d1f38899a8169b3d4335fd1a26344ec4a522
+size 5361736696
diff --git a/model-00086-of-00142.safetensors b/model-00086-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..54f5e7d90d6f22da7271afec8b617726b1513c3f
--- /dev/null
+++ b/model-00086-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80724e44554a5eaa5f7266cd0fe3d3b3930066399d69a5f3ede191e7abb61342
+size 5363339120
diff --git a/model-00087-of-00142.safetensors b/model-00087-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..373f4c73dbc52e47ba9694ccfdb2d930f512ddf6
--- /dev/null
+++ b/model-00087-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4d2aee9f1d403c781706479fc15393c9d50ba021acaac371bda7436981171643
+size 5361736648
diff --git a/model-00088-of-00142.safetensors b/model-00088-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b2a6a5355589549282d17d8520d3ab16a96323bf
--- /dev/null
+++ b/model-00088-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7ed2ce0f76905fad2b235aa5958b3e38d2b2d247b3248462edd6bec012e5f2af
+size 5363339160
diff --git a/model-00089-of-00142.safetensors b/model-00089-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..59fb67622b2534268dd5110788e18f78d065e59b
--- /dev/null
+++ b/model-00089-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fc8f74f5d1c59f7ae59b781db77b9e5b6b14e3afff606577ed7a022c9099f772
+size 5361736512
diff --git a/model-00090-of-00142.safetensors b/model-00090-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..35df3646e4ece72b50f5e36f4d73498fff0a74f5
--- /dev/null
+++ b/model-00090-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60b7c55a3afd2437114230c43ed055115d13f813d9842856c631ee83f2bc75ca
+size 5363339296
diff --git a/model-00091-of-00142.safetensors b/model-00091-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..50d2cac84b4cd94b83fe646527e0cb5db4e2b155
--- /dev/null
+++ b/model-00091-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81ba80b3e2bc846d818acc34086f268edd3220e824b65cfaf39e9de78e0efcff
+size 5361736376
diff --git a/model-00092-of-00142.safetensors b/model-00092-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..32e45b4532e9ce03b1b7894c49a93a2c1bc32268
--- /dev/null
+++ b/model-00092-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e7b665e99148cabc64b083120befb34ce3bb6bdb3be076796fc1e387332bfec
+size 5363339432
diff --git a/model-00093-of-00142.safetensors b/model-00093-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..5884d7d5fd0195de5745fe4c26faea8ce22fdc67
--- /dev/null
+++ b/model-00093-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e8cf02a6a1d2ce2a277b7f7e412f50c83208a06db8b7b36098d61cbb030909b6
+size 5361736248
diff --git a/model-00094-of-00142.safetensors b/model-00094-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..24b13a664d0d77b66f95432baa1547d4fd2926bb
--- /dev/null
+++ b/model-00094-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad65481085844f604e45a2062124d7136a34f656ed08d91c7e9d367c8a1a8be2
+size 5363339576
diff --git a/model-00095-of-00142.safetensors b/model-00095-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9ed19af12bc9c5227b3d55df50aece4941576b3a
--- /dev/null
+++ b/model-00095-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1373ea4325b07ed30e871d92ae31a4d6f413a0f5aedf42cfb6ef280d58ea2647
+size 5363339104
diff --git a/model-00096-of-00142.safetensors b/model-00096-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..94cf15dd9e8a67c72d66e4a981caba107d9a5076
--- /dev/null
+++ b/model-00096-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ec77fa04a9ae727212106274fbac1c783ad0e40a446064fd20fb5ebc91f97dce
+size 5361736696
diff --git a/model-00097-of-00142.safetensors b/model-00097-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..a8e29484ff118e6d97baff2dd1fe274dcdc6aad4
--- /dev/null
+++ b/model-00097-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed15272340c1d88dc8ed31615fe739d31bde94285ce376adab69eebab617893d
+size 5363339120
diff --git a/model-00098-of-00142.safetensors b/model-00098-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..09eaa40794c8c5ed8d37fb9a3cfdfb4c9c8e1ab2
--- /dev/null
+++ b/model-00098-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d30bad76ba7422cba887b828324b40b2aa045cdef9d88584318d3f5896603b09
+size 5361736696
diff --git a/model-00099-of-00142.safetensors b/model-00099-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e076a5514b52726bdb937186f009e5ddc7fecd5e
--- /dev/null
+++ b/model-00099-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b64e2913fef5c20490e41f5b48c09de50bd810def10eb31b73a73eb82eadae97
+size 5363338784
diff --git a/model-00100-of-00142.safetensors b/model-00100-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..4ea190e700a03096838e8fc80a7f7632763498f8
--- /dev/null
+++ b/model-00100-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f3901b64d28af66a091c950edd784e0abaaada4220632494204fee8bb7a7257
+size 5361735712
diff --git a/model-00101-of-00142.safetensors b/model-00101-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..94052530fa05f027a37f0423b7df44885a33644a
--- /dev/null
+++ b/model-00101-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b8b320af495b1f8b203e0de3cca19abf78db0936db7b7113fbf6d1236049b754
+size 5363338872
diff --git a/model-00102-of-00142.safetensors b/model-00102-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..dde536df4e054cf463e8987db67ff53d832e810d
--- /dev/null
+++ b/model-00102-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:873e97c71b26deb73b1767b2b31f61477a1f48a15c6866025990465a84dbaa2f
+size 5361736424
diff --git a/model-00103-of-00142.safetensors b/model-00103-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ec60c6e8e3b43f9a3a97e50b016f56309bd3b6ae
--- /dev/null
+++ b/model-00103-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9e76012a962bcac8301a9a81af5a5acd2284085814ddab24c9a2c511cbf82fbd
+size 5363339384
diff --git a/model-00104-of-00142.safetensors b/model-00104-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..27a4af91ac9a8db3cd1049b22ac9aa97d0ba9c47
--- /dev/null
+++ b/model-00104-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b73dbfe8a8d96453413bb6ad94fb255e66dbe3d59502ebb38f5cddb968501946
+size 5361736296
diff --git a/model-00105-of-00142.safetensors b/model-00105-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..fb871a0d7d289fc2ae3d2658a2b2c3e153306682
--- /dev/null
+++ b/model-00105-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6a954d9068edef54cd27e627cccaa3b09cff9b3b58e3519b3fbef823f9a4ab14
+size 5363339520
diff --git a/model-00106-of-00142.safetensors b/model-00106-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..39a320b780ba5f93aa248abfe1cdf413d8fa18b4
--- /dev/null
+++ b/model-00106-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9b741d263cc5105391c6a9d8ddc9c320148dd291fd119e8f3d145265306fc9bf
+size 5363339104
diff --git a/model-00107-of-00142.safetensors b/model-00107-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..aef9e81ddf4a5a7e5efc5aa9f17c5a32f17be2a6
--- /dev/null
+++ b/model-00107-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd8a18b196255838ecc3c996690a933cc274794390cfebd7368cca7b80d564da
+size 5361736696
diff --git a/model-00108-of-00142.safetensors b/model-00108-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..446b95abd1d3f14b0cd5a0930c615ee0b973391f
--- /dev/null
+++ b/model-00108-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67484855cd50dda6304040c4513cfe8af6d1b927e97e222ded9160e9e184fd14
+size 5363339104
diff --git a/model-00109-of-00142.safetensors b/model-00109-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..a165a4f551300de53843f8e5a584fe9e2a8094b0
--- /dev/null
+++ b/model-00109-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a5f5038fc9785c8dee01b19ad9e1322164cf54830398a247b631a8743ce3f575
+size 5361736696
diff --git a/model-00110-of-00142.safetensors b/model-00110-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b10ccd8f421ddb5886e0bbc2beee7823b7a7de0c
--- /dev/null
+++ b/model-00110-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18671e2f94f58f2ec04fde2a9ffcc69a0067ed5f9e0c2ea53a5da4546516dc26
+size 5363339112
diff --git a/model-00111-of-00142.safetensors b/model-00111-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3d2db650ef080503443343ccea358b7be03d6a58
--- /dev/null
+++ b/model-00111-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1b5928d158a1c5f6397417379fc97fc9f3dd70153787724e427e72432caff5f5
+size 5361736616
diff --git a/model-00112-of-00142.safetensors b/model-00112-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..54b10dcd7fe824a7a376b8ba7b796e5d581d6e07
--- /dev/null
+++ b/model-00112-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:55c451b30d06b64be28bf4d32883f3cefe7a561d6b7623834c58276dc647c344
+size 5363339200
diff --git a/model-00113-of-00142.safetensors b/model-00113-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8d063877b1a0d10fa82f76935b465a30d8985acf
--- /dev/null
+++ b/model-00113-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6352977f557e3cd8bc7507df15fd625a1714120897e08b00d819656db017c0bb
+size 5361736480
diff --git a/model-00114-of-00142.safetensors b/model-00114-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e4c4e069cfd0d1c1b0b8fa27ff371f0a6ebce7bf
--- /dev/null
+++ b/model-00114-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5fd83b52ac8f08738f5fa8e2c74ce16033946bcc262d2458049254e76ea5f2d9
+size 5363339328
diff --git a/model-00115-of-00142.safetensors b/model-00115-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d2c3ace852bb06e716e9d041ee95cb5265bebb6b
--- /dev/null
+++ b/model-00115-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:314fc05e8699bccd5420806525620ffad96d4cf6e085613f3718afdbfc715d47
+size 5361736344
diff --git a/model-00116-of-00142.safetensors b/model-00116-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..7ce61a5debadacfc9e6241d0fbfe17eed670150c
--- /dev/null
+++ b/model-00116-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:053efffafb9bc63e45fb5f31b0ca2be4ab32a44fa4d290a8367e8a55ada2da4c
+size 5363339456
diff --git a/model-00117-of-00142.safetensors b/model-00117-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..3ebaa4475b159f683f31a66b12290bc54b5d794f
--- /dev/null
+++ b/model-00117-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be7ecc4f974fe76113a32c68f3b0d4c5139025114dad01f1982e155aaf2d3605
+size 5364883192
diff --git a/model-00118-of-00142.safetensors b/model-00118-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..130bf37aac0bda7b903d56b4e9c9bc8bb2fe4146
--- /dev/null
+++ b/model-00118-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:55cead6933f98f1044d2706fea44837066676d7e179fe18772c07316593b13d9
+size 5360192648
diff --git a/model-00119-of-00142.safetensors b/model-00119-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b02e842d7873a037b702249a58ec9303eb43936a
--- /dev/null
+++ b/model-00119-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:437b5164ef8bffc0c477c239fbccc4b813f206ca5984104fa3a8207f4625e9ed
+size 5363339008
diff --git a/model-00120-of-00142.safetensors b/model-00120-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9a00b04aca1f7d8f0dc9c42efab0832f0998d647
--- /dev/null
+++ b/model-00120-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:23b49900d74f63554c61be9dadb5946903e00bb852081fba77b26d77af064ac8
+size 5361735840
diff --git a/model-00121-of-00142.safetensors b/model-00121-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..9c68909272673fe47c7f3cbdc946c26b77a3eca7
--- /dev/null
+++ b/model-00121-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51afac838d8e84927ce68eaadcbe4273606f69094e29a762ccebc35be2c8c681
+size 5363338504
diff --git a/model-00122-of-00142.safetensors b/model-00122-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..45ee771773dc40401293dfe4a1200f26d569a905
--- /dev/null
+++ b/model-00122-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d735a8b726c38fef009c766e2fcc89f328bf9d468c0335fb57847959631523c0
+size 5361736656
diff --git a/model-00123-of-00142.safetensors b/model-00123-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..1a360fb7774c738d131e5d5302fffbc3c04afed4
--- /dev/null
+++ b/model-00123-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ba6b15e37307b72dfb1343baab5d5469708f9ac48603439bcedf05a8bd4c81b
+size 5363339152
diff --git a/model-00124-of-00142.safetensors b/model-00124-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ac99db8bc52b0adcf48e27b1d307262fd3b3ba64
--- /dev/null
+++ b/model-00124-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0f4f759a35ccab24d6b3e8e7b549cfe0e720828ed31cb909847fde2755df0627
+size 5361736520
diff --git a/model-00125-of-00142.safetensors b/model-00125-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..6dcc06a3161b12a610e66f3bd6ecfeb6ca5dc22d
--- /dev/null
+++ b/model-00125-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6fe518d2a1936ad6b2a7aa4f0b6dbd7452822261be7d1e063754410e0796147b
+size 5363339288
diff --git a/model-00126-of-00142.safetensors b/model-00126-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..946965319a433224a2c0cc1a7bbfc932a13653e1
--- /dev/null
+++ b/model-00126-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d00348fc44eb54efd8e3e1705ddb9e76113bf8ccc4de13b2d76f4ade8e3155a
+size 5361736392
diff --git a/model-00127-of-00142.safetensors b/model-00127-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..cbd08d859bb4d9632d94615e143dcadd8505943e
--- /dev/null
+++ b/model-00127-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf652a702fa2d8c662a066d0a5bee79549d17882ab272cb74e3e4e03d44335bc
+size 5363339416
diff --git a/model-00128-of-00142.safetensors b/model-00128-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8166cd88a3cad39411acf520dc801d678e4eade3
--- /dev/null
+++ b/model-00128-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:61db42ba6631fdb07d819c120ff9f61f4ff69efd018e094ef3c2376b38033f06
+size 5361736256
diff --git a/model-00129-of-00142.safetensors b/model-00129-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..ef386324cc41fce3a71979ec4085ab506c47ab08
--- /dev/null
+++ b/model-00129-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:72dd93b904c074a6908fb89aec499c7e1707924a9f55e5dc5223d640d86752fe
+size 5363339560
diff --git a/model-00130-of-00142.safetensors b/model-00130-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c7b375bc095e4e020f9a42bf32cbd9f211972c6f
--- /dev/null
+++ b/model-00130-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e962ce71a2e0dd5232265843fadee064e961b2fa290b5ae3f1e82128f11ca6f2
+size 5363339104
diff --git a/model-00131-of-00142.safetensors b/model-00131-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..4df74c5cab3144811599d6392de150c914c7f75f
--- /dev/null
+++ b/model-00131-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0bc16ccebde523df12e57d3e49514d86073564326d1a75f6c5b929ec557520e3
+size 5361736696
diff --git a/model-00132-of-00142.safetensors b/model-00132-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b5ad95278b96c7f6880fa7248028ac4ffa7343c5
--- /dev/null
+++ b/model-00132-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d30285d669521cd0ad2a17b0e9d142c86624c364947cb15fec8e5677ab493301
+size 5363339104
diff --git a/model-00133-of-00142.safetensors b/model-00133-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..e1e158e0c0d8f27944a3b22e7b23ec4e7ebec485
--- /dev/null
+++ b/model-00133-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e604cc4a28638f58d0e19332838bf019c4f649981c94dc0110155ca2ef536ab9
+size 5361736696
diff --git a/model-00134-of-00142.safetensors b/model-00134-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..b851c61ae4df230bf20a71cc17c6d07deaf15bb3
--- /dev/null
+++ b/model-00134-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25f2497a9b0ff1be584fb8367d240ae48d774e39712b29150726e18a740bda9b
+size 5363339120
diff --git a/model-00135-of-00142.safetensors b/model-00135-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..85f86806f8b83bcd5c758208642ab3fc113874d1
--- /dev/null
+++ b/model-00135-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f35babc4edd3a78d04a0f6a018ccdecc5c7f9f5d147cb26a0f15c0bb151ee71d
+size 5361736568
diff --git a/model-00136-of-00142.safetensors b/model-00136-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..c2565bcaff759bfa1700ac6b4b1d6f72608cb815
--- /dev/null
+++ b/model-00136-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:705bd4c571863f08e9d6c2e40bca87add686764de0af40dd2eb36889f150a150
+size 5363324248
diff --git a/model-00137-of-00142.safetensors b/model-00137-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..98ef91bcd7189d42f724404f8630bebbc66bc57b
--- /dev/null
+++ b/model-00137-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e802972d6ad81a10fb18cbf054834c1700aa2d67e025116c8451036d5a70b8db
+size 5361736464
diff --git a/model-00138-of-00142.safetensors b/model-00138-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..d6bbc83c150135461e3580e6f9cbb0526b6e676a
--- /dev/null
+++ b/model-00138-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a37ce395ff4d5e7e2e57283e0bf5633c5bd867b87d0a874b2e5e604a9200bc22
+size 5363351176
diff --git a/model-00139-of-00142.safetensors b/model-00139-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..15ad43a90bdb1c56fe2a5ea7452d4830da04fe9d
--- /dev/null
+++ b/model-00139-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:862e5ebd79fdbfd1406a227740efd5e7d55090afddd66c2a660285bbd496a534
+size 5361735472
diff --git a/model-00140-of-00142.safetensors b/model-00140-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..8c65e78434b02616ad5a0bb818e1ef4a15c05aad
--- /dev/null
+++ b/model-00140-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c013efa52322d2e9d3033987df200c89420b5c1124b52568497f67de21d7287
+size 5363338640
diff --git a/model-00141-of-00142.safetensors b/model-00141-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..69754c47086d445b7a9aff2bccd15d6c8ec4ab90
--- /dev/null
+++ b/model-00141-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fba01737a521b4edd93b51b966730b12fcc609b3fff7cae776b195a847f87431
+size 5342346848
diff --git a/model-00142-of-00142.safetensors b/model-00142-of-00142.safetensors
new file mode 100644
index 0000000000000000000000000000000000000000..332dac3ae29a5201ff9b76d470797653c6c22506
--- /dev/null
+++ b/model-00142-of-00142.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:68f1a5ce91c5fc330725bfca39aa63ba6f72eb02daaa85ed2ee4bfa7f00d0e5d
+size 146853760
diff --git a/model.safetensors.index.json b/model.safetensors.index.json
new file mode 100644
index 0000000000000000000000000000000000000000..b462c0bcd8028c8c87650fa6d00cda11cd585c5b
--- /dev/null
+++ b/model.safetensors.index.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a5c43b9fd27c3ba067265426673431c9cc747d8a5b27e1dab9df9c0fd575e371
+size 11396202
diff --git a/tokenizer.json b/tokenizer.json
new file mode 100644
index 0000000000000000000000000000000000000000..aba40197a4cdb5607f4ab7a05fb0a4ee8054fd6d
--- /dev/null
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:19e773648cb4e65de8660ea6365e10acca112d42a854923df93db4a6f333a82d
+size 20217442
diff --git a/tokenizer_config.json b/tokenizer_config.json
new file mode 100644
index 0000000000000000000000000000000000000000..50ef91a5c0e787b36a059d8f3c85fb9e9456a5dc
--- /dev/null
+++ b/tokenizer_config.json
@@ -0,0 +1,36 @@
+{
+ "backend": "tokenizers",
+ "bos_token": null,
+ "clean_up_tokenization_spaces": false,
+ "do_lower_case": false,
+ "eos_token": "<|endoftext|>",
+ "extra_special_tokens": [
+ "<|endoftext|>",
+ "[MASK]",
+ "[gMASK]",
+ "[sMASK]",
+ "",
+ "",
+ "<|system|>",
+ "<|user|>",
+ "<|assistant|>",
+ "<|observation|>",
+ "<|begin_of_image|>",
+ "<|end_of_image|>",
+ "<|begin_of_video|>",
+ "<|end_of_video|>",
+ "<|begin_of_audio|>",
+ "<|end_of_audio|>",
+ "<|begin_of_transcription|>",
+ "<|end_of_transcription|>"
+ ],
+ "is_local": true,
+ "model_max_length": 202752,
+ "model_specific_special_tokens": {},
+ "pad_token": "[MASK]",
+ "padding_side": "left",
+ "remove_space": false,
+ "tokenizer_class": "TokenizersBackend",
+ "unk_token": null,
+ "chat_template": "[gMASK]\n{%- if tools -%}\n{%- macro tool_to_json(tool) -%}\n {%- set ns_tool = namespace(first=true) -%}\n {{ '{' -}}\n {%- for k, v in tool.items() -%}\n {%- if k != 'defer_loading' and k != 'strict' -%}\n {%- if not ns_tool.first -%}{{- ', ' -}}{%- endif -%}\n {%- set ns_tool.first = false -%}\n \"{{ k }}\": {{ v | tojson(ensure_ascii=False) }}\n {%- endif -%}\n {%- endfor -%}\n {{- '}' -}}\n{%- endmacro -%}\n<|system|>\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within XML tags:\n\n{% for tool in tools %}\n{%- if 'function' in tool -%}\n {%- set tool = tool['function'] -%}\n{%- endif -%}\n{% if tool.defer_loading is not defined or not tool.defer_loading %}\n{{ tool_to_json(tool) }}\n{% endif %}\n{% endfor %}\n\n\nFor each function call, output the function name and arguments within the following XML format:\n{function-name}{arg-key-1}{arg-value-1}{arg-key-2}{arg-value-2}...{%- endif -%}\n{%- macro visible_text(content) -%}\n {%- if content is string -%}\n {{- content }}\n {%- elif content is iterable and content is not mapping -%}\n {%- for item in content -%}\n {%- if item is mapping and item.type == 'text' -%}\n {{- item.text }}\n {%- elif item is string -%}\n {{- item }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{- content }}\n {%- endif -%}\n{%- endmacro -%}\n{%- set ns = namespace(last_user_index=-1, thinking_indices='') -%}\n{%- for m in messages %}\n {%- if m.role == 'user' %}\n {%- set ns.last_user_index = loop.index0 -%}\n {%- elif m.role == 'assistant' %}\n {%- if m.reasoning_content is string %}\n {%- set ns.thinking_indices = ns.thinking_indices ~ ',' ~ ns.last_user_index ~ ',' -%}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- set ns.has_thinking = false -%}\n{%- for m in messages -%}\n{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}{% set ns.has_thinking = (',' ~ loop.index0 ~ ',') in ns.thinking_indices -%}\n{%- elif 
m.role == 'assistant' -%}\n<|assistant|>\n{%- set content = visible_text(m.content) %}\n{%- if m.reasoning_content is string %}\n {%- set reasoning_content = m.reasoning_content %}\n{%- elif '' in content %}\n {%- set reasoning_content = content.split('')[0].split('')[-1] %}\n {%- set content = content.split('')[-1] %}\n{%- elif loop.index0 > ns.last_user_index and not (enable_thinking is defined and not enable_thinking) %}\n {%- set reasoning_content = '' %}\n{%- elif loop.index0 < ns.last_user_index and ns.has_thinking %}\n {%- set reasoning_content = '' %}\n{%- endif %}\n{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content is defined -%}\n{{ '' + reasoning_content + ''}}\n{%- else -%}\n{{ '' }}\n{%- endif -%}\n{%- if content.strip() -%}\n{{ content.strip() }}\n{%- endif -%}\n{% if m.tool_calls %}\n{% for tc in m.tool_calls %}\n{%- if tc.function %}\n {%- set tc = tc.function %}\n{%- endif %}\n{{- '' + tc.name -}}\n{% set _args = tc.arguments %}{% for k, v in _args.items() %}{{ k }}{{ v | tojson(ensure_ascii=False) if v is not string else v }}{% endfor %}{% endfor %}\n{% endif %}\n{%- elif m.role == 'tool' -%}\n{%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|observation|>' -}}\n{%- endif %}\n{%- if m.content is string -%}\n {{- '' + m.content + '' -}}\n{%- else -%}\n {{- '\\n' -}}\n {% for tr in m.content %}\n {%- for tool in tools -%}\n {%- if 'function' in tool -%}\n {%- set tool = tool['function'] -%}\n {%- endif -%}\n {%- if tool.name == tr.name -%}\n {{- tool_to_json(tool) + '\\n' -}}\n {%- endif -%}\n {%- endfor -%}\n {%- endfor -%}\n {{- '' -}}\n{% endif -%}\n{%- elif m.role == 'system' -%}\n<|system|>{{ visible_text(m.content) }}\n{%- endif -%}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n <|assistant|>{{- '' if (enable_thinking is defined and not enable_thinking) else '' -}}\n{%- endif -%}"
+}
\ No newline at end of file