Instructions to use 0xSero/Trinity-337B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use 0xSero/Trinity-337B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="0xSero/Trinity-337B", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("0xSero/Trinity-337B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("0xSero/Trinity-337B", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use 0xSero/Trinity-337B with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "0xSero/Trinity-337B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "0xSero/Trinity-337B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/0xSero/Trinity-337B
- SGLang
How to use 0xSero/Trinity-337B with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "0xSero/Trinity-337B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "0xSero/Trinity-337B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "0xSero/Trinity-337B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "0xSero/Trinity-337B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use 0xSero/Trinity-337B with Docker Model Runner:
docker model run hf.co/0xSero/Trinity-337B
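The curl calls above post a JSON body to an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal Python sketch of the same request, using only the standard library, might look like the following. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default URL assumes the vLLM server from the example above is listening on port 8000 (use port 30000 for the SGLang examples).

```python
import json
import urllib.request

MODEL_ID = "0xSero/Trinity-337B"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (without sending) a POST request for /v1/chat/completions.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response.

    Requires a server that is already running; not called at import time.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send_chat_request(build_chat_request("What is the capital of France?"))` would return the parsed response body.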
Trinity-Large-Thinking pruned to 216/256 experts (15.6% pruned) via REAP
This view is limited to 50 files because it contains too many changes.
- .gitattributes +1 -0
- chat_template.jinja +159 -0
- config.json +108 -0
- configuration_afmoe.py +133 -0
- generation_config.json +9 -0
- model-00001.safetensors +3 -0
- model-00002.safetensors +3 -0
- model-00003.safetensors +3 -0
- model-00004.safetensors +3 -0
- model-00005.safetensors +3 -0
- model-00006.safetensors +3 -0
- model-00007.safetensors +3 -0
- model-00008.safetensors +3 -0
- model-00009.safetensors +3 -0
- model-00010.safetensors +3 -0
- model-00011.safetensors +3 -0
- model-00012.safetensors +3 -0
- model-00013.safetensors +3 -0
- model-00014.safetensors +3 -0
- model-00015.safetensors +3 -0
- model-00016.safetensors +3 -0
- model-00017.safetensors +3 -0
- model-00018.safetensors +3 -0
- model-00019.safetensors +3 -0
- model-00020.safetensors +3 -0
- model-00021.safetensors +3 -0
- model-00022.safetensors +3 -0
- model-00023.safetensors +3 -0
- model-00024.safetensors +3 -0
- model-00025.safetensors +3 -0
- model-00026.safetensors +3 -0
- model-00027.safetensors +3 -0
- model-00028.safetensors +3 -0
- model-00029.safetensors +3 -0
- model-00030.safetensors +3 -0
- model-00031.safetensors +3 -0
- model-00032.safetensors +3 -0
- model-00033.safetensors +3 -0
- model-00034.safetensors +3 -0
- model-00035.safetensors +3 -0
- model-00036.safetensors +3 -0
- model-00037.safetensors +3 -0
- model-00038.safetensors +3 -0
- model-00039.safetensors +3 -0
- model-00040.safetensors +3 -0
- model-00041.safetensors +3 -0
- model-00042.safetensors +3 -0
- model-00043.safetensors +3 -0
- model-00044.safetensors +3 -0
- model-00045.safetensors +3 -0
.gitattributes
CHANGED
|
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
chat_template.jinja
ADDED
|
@@ -0,0 +1,159 @@
|
| 1 |
+
<|begin_of_text|>{%- macro render_extra_keys(json_dict, handled_keys) -%}
|
| 2 |
+
{%- if json_dict is mapping %}
|
| 3 |
+
{%- for json_key in json_dict if json_key not in handled_keys %}
|
| 4 |
+
{%- if json_dict[json_key] is mapping or (json_dict[json_key] is sequence and json_dict[json_key] is not string) %}
|
| 5 |
+
{{- '\n<' ~ json_key ~ '>' ~ (json_dict[json_key] | tojson | safe) ~ '</' ~ json_key ~ '>' }}
|
| 6 |
+
{%- else %}
|
| 7 |
+
{{- '\n<' ~ json_key ~ '>' ~ (json_dict[json_key] | string) ~ '</' ~ json_key ~ '>' }}
|
| 8 |
+
{%- endif %}
|
| 9 |
+
{%- endfor %}
|
| 10 |
+
{%- endif %}
|
| 11 |
+
{%- endmacro -%}
|
| 12 |
+
|
| 13 |
+
{%- macro render_tool_call(raw_tool_call) -%}
|
| 14 |
+
{%- if raw_tool_call.function is defined and raw_tool_call.function is mapping %}
|
| 15 |
+
{%- set tool_call = raw_tool_call.function %}
|
| 16 |
+
{%- else %}
|
| 17 |
+
{%- set tool_call = raw_tool_call %}
|
| 18 |
+
{%- endif %}
|
| 19 |
+
{{- '<tool_call>\n<function=' + (tool_call.name | default('') | string) + '>\n' }}
|
| 20 |
+
{%- if tool_call.arguments is defined and tool_call.arguments is mapping %}
|
| 21 |
+
{%- for args_name, args_value in tool_call.arguments.items() %}
|
| 22 |
+
{{- '<parameter=' + (args_name | string) + '>\n' }}
|
| 23 |
+
{%- if args_value is mapping or (args_value is sequence and args_value is not string) %}
|
| 24 |
+
{{- args_value | tojson | safe }}
|
| 25 |
+
{%- else %}
|
| 26 |
+
{{- args_value | string }}
|
| 27 |
+
{%- endif %}
|
| 28 |
+
{{- '\n</parameter>\n' }}
|
| 29 |
+
{%- endfor %}
|
| 30 |
+
{%- endif %}
|
| 31 |
+
{{- '</function>\n</tool_call>' }}
|
| 32 |
+
{%- endmacro -%}
|
| 33 |
+
|
| 34 |
+
{%- set system_message = none %}
|
| 35 |
+
{%- if messages and messages[0]["role"] == "system" %}
|
| 36 |
+
{%- set system_message = messages[0]["content"] %}
|
| 37 |
+
{%- set loop_messages = messages[1:] %}
|
| 38 |
+
{%- else %}
|
| 39 |
+
{%- set loop_messages = messages %}
|
| 40 |
+
{%- endif %}
|
| 41 |
+
|
| 42 |
+
{%- if not tools is defined %}
|
| 43 |
+
{%- set tools = [] %}
|
| 44 |
+
{%- endif %}
|
| 45 |
+
{%- set has_tools = tools is iterable and tools is not string and tools | length > 0 %}
|
| 46 |
+
|
| 47 |
+
{%- if system_message is not none or has_tools %}
|
| 48 |
+
{{- '<|im_start|>system\n' }}
|
| 49 |
+
{%- if system_message is not none %}
|
| 50 |
+
{{- system_message }}
|
| 51 |
+
{%- else %}
|
| 52 |
+
{{- "You are Trinity Large, a helpful assistant developed by Arcee AI, that can interact with a computer to solve tasks." }}
|
| 53 |
+
{%- endif %}
|
| 54 |
+
{%- if has_tools %}
|
| 55 |
+
{{- "\n\n# Tools\n\nYou have access to the following functions:\n\n<tools>" }}
|
| 56 |
+
{%- for tool in tools %}
|
| 57 |
+
{%- if tool.function is defined and tool.function is mapping %}
|
| 58 |
+
{%- set tool = tool.function %}
|
| 59 |
+
{%- endif %}
|
| 60 |
+
{{- '\n<function>\n<name>' ~ (tool.name | default('') | string) ~ '</name>' }}
|
| 61 |
+
{%- if tool.description is defined and tool.description is not none %}
|
| 62 |
+
{{- '\n<description>' ~ (tool.description | string | trim) ~ '</description>' }}
|
| 63 |
+
{%- endif %}
|
| 64 |
+
{{- '\n<parameters>' }}
|
| 65 |
+
{%- if tool.parameters is defined and tool.parameters is mapping and tool.parameters.properties is defined and tool.parameters.properties is mapping %}
|
| 66 |
+
{%- for param_name, param_fields in tool.parameters.properties.items() %}
|
| 67 |
+
{{- '\n<parameter>\n<name>' ~ (param_name | string) ~ '</name>' }}
|
| 68 |
+
{%- if param_fields is mapping and param_fields.type is defined and param_fields.type is not none %}
|
| 69 |
+
{{- '\n<type>' ~ (param_fields.type | string) ~ '</type>' }}
|
| 70 |
+
{%- endif %}
|
| 71 |
+
{%- if param_fields is mapping and param_fields.description is defined and param_fields.description is not none %}
|
| 72 |
+
{{- '\n<description>' ~ (param_fields.description | string | trim) ~ '</description>' }}
|
| 73 |
+
{%- endif %}
|
| 74 |
+
{%- if param_fields is mapping %}
|
| 75 |
+
{%- set handled_keys = ['name', 'type', 'description'] %}
|
| 76 |
+
{{- render_extra_keys(param_fields, handled_keys) }}
|
| 77 |
+
{%- endif %}
|
| 78 |
+
{{- '\n</parameter>' }}
|
| 79 |
+
{%- endfor %}
|
| 80 |
+
{%- endif %}
|
| 81 |
+
{%- if tool.parameters is defined %}
|
| 82 |
+
{%- set handled_keys = ['type', 'properties'] %}
|
| 83 |
+
{{- render_extra_keys(tool.parameters, handled_keys) }}
|
| 84 |
+
{%- endif %}
|
| 85 |
+
{{- '\n</parameters>' }}
|
| 86 |
+
{%- set handled_keys = ['type', 'name', 'description', 'parameters'] %}
|
| 87 |
+
{{- render_extra_keys(tool, handled_keys) }}
|
| 88 |
+
{{- '\n</function>' }}
|
| 89 |
+
{%- endfor %}
|
| 90 |
+
{{- "\n</tools>" }}
|
| 91 |
+
{{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
|
| 92 |
+
{%- endif %}
|
| 93 |
+
{{- '<|im_end|>\n' }}
|
| 94 |
+
{%- endif %}
|
| 95 |
+
|
| 96 |
+
{%- for message in loop_messages %}
|
| 97 |
+
{%- set role = message.role | default('') %}
|
| 98 |
+
{%- if role == "assistant" %}
|
| 99 |
+
{%- set content_str = '' if message.content is none else (message.content | string) %}
|
| 100 |
+
{%- set trimmed_content = content_str | trim %}
|
| 101 |
+
|
| 102 |
+
{%- set has_reasoning_content = message.reasoning_content is defined %}
|
| 103 |
+
{%- set has_reasoning = has_reasoning_content or (message.reasoning is defined) %}
|
| 104 |
+
|
| 105 |
+
{%- if has_reasoning_content %}
|
| 106 |
+
{%- set reasoning_value = message.reasoning_content %}
|
| 107 |
+
{%- elif message.reasoning is defined %}
|
| 108 |
+
{%- set reasoning_value = message.reasoning %}
|
| 109 |
+
{%- else %}
|
| 110 |
+
{%- set reasoning_value = none %}
|
| 111 |
+
{%- endif %}
|
| 112 |
+
|
| 113 |
+
{%- set has_tool_calls = message.tool_calls is defined and message.tool_calls is iterable and message.tool_calls is not string and message.tool_calls | length > 0 %}
|
| 114 |
+
|
| 115 |
+
{{- '<|im_start|>assistant\n' }}
|
| 116 |
+
{%- if has_reasoning %}
|
| 117 |
+
{%- if reasoning_value %}
|
| 118 |
+
{{- '<think>' + (reasoning_value | string | trim) + '</think>' }}
|
| 119 |
+
{%- else %}
|
| 120 |
+
{{- '<think></think>' }}
|
| 121 |
+
{%- endif %}
|
| 122 |
+
{%- if trimmed_content %}
|
| 123 |
+
{{- '\n' + trimmed_content }}
|
| 124 |
+
{%- endif %}
|
| 125 |
+
{%- elif has_tool_calls %}
|
| 126 |
+
{%- if trimmed_content %}
|
| 127 |
+
{{- trimmed_content }}
|
| 128 |
+
{%- endif %}
|
| 129 |
+
{%- else %}
|
| 130 |
+
{{- content_str }}
|
| 131 |
+
{%- endif %}
|
| 132 |
+
|
| 133 |
+
{%- if has_tool_calls %}
|
| 134 |
+
{%- for tool_call in message.tool_calls %}
|
| 135 |
+
{%- set separator = '\n' if ((loop.first and (has_reasoning or trimmed_content)) or (not loop.first)) else '' -%}
|
| 136 |
+
{{- separator + render_tool_call(tool_call) }}
|
| 137 |
+
{%- endfor %}
|
| 138 |
+
{%- endif %}
|
| 139 |
+
{{- '<|im_end|>\n' }}
|
| 140 |
+
{%- elif role == "tool" or role == "observation" or role == "function" %}
|
| 141 |
+
{%- if loop.first or loop.previtem.role not in ["tool", "observation", "function"] %}
|
| 142 |
+
{{- '<|im_start|>user\n' }}
|
| 143 |
+
{%- endif %}
|
| 144 |
+
{{- '<tool_response>\n' }}
|
| 145 |
+
{{- '' if message.content is none else (message.content | string) }}
|
| 146 |
+
{{- '\n</tool_response>\n' }}
|
| 147 |
+
{%- if loop.last or loop.nextitem.role not in ["tool", "observation", "function"] %}
|
| 148 |
+
{{- '<|im_end|>\n' }}
|
| 149 |
+
{%- endif %}
|
| 150 |
+
{%- else %}
|
| 151 |
+
{{- '<|im_start|>' + (role | string) }}
|
| 152 |
+
{{- '\n' + ('' if message.content is none else (message.content | string)) }}
|
| 153 |
+
{{- '<|im_end|>\n' }}
|
| 154 |
+
{%- endif %}
|
| 155 |
+
{%- endfor %}
|
| 156 |
+
|
| 157 |
+
{%- if add_generation_prompt %}
|
| 158 |
+
{{- '<|im_start|>assistant\n<think>' }}
|
| 159 |
+
{%- endif %}
|
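The template above asks the model to emit tool calls as a `<function=...>` block nested inside `<tool_call>` tags, with one `<parameter=...>` block per argument. A client that consumes such output could recover the calls roughly as sketched below. This is an illustration that assumes the model follows the format exactly; the function name `parse_tool_calls` and the regular expressions are hypothetical and not part of this repository. Argument values are returned as raw strings, with no JSON decoding.

```python
import re

# One <tool_call> block wrapping a single <function=NAME> ... </function> body.
TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=([^>\n]+)>\n(.*?)</function>\s*</tool_call>",
    re.DOTALL,
)
# One <parameter=NAME> block; the value may span multiple lines.
PARAM_RE = re.compile(r"<parameter=([^>\n]+)>\n(.*?)\n</parameter>", re.DOTALL)


def parse_tool_calls(text):
    """Extract tool calls from assistant output (hypothetical helper).

    Returns a list of {"name": str, "arguments": {str: str}} dictionaries.
    """
    calls = []
    for name, body in TOOL_CALL_RE.findall(text):
        arguments = {key: value for key, value in PARAM_RE.findall(body)}
        calls.append({"name": name, "arguments": arguments})
    return calls


sample_output = (
    "Let me check.\n"
    "<tool_call>\n"
    "<function=get_weather>\n"
    "<parameter=city>\nParis\n</parameter>\n"
    "<parameter=note>\nline one\nline two\n</parameter>\n"
    "</function>\n"
    "</tool_call>"
)
parsed_calls = parse_tool_calls(sample_output)
```

Text that precedes the `<tool_call>` block, which the template permits as optional reasoning, is ignored by this parser.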
config.json
ADDED
|
@@ -0,0 +1,108 @@
|
| 1 |
+
{
|
| 2 |
+
"architectures": [
|
| 3 |
+
"AfmoeForCausalLM"
|
| 4 |
+
],
|
| 5 |
+
"attention_dropout": 0.0,
|
| 6 |
+
"auto_map": {
|
| 7 |
+
"AutoConfig": "configuration_afmoe.AfmoeConfig",
|
| 8 |
+
"AutoModel": "modeling_afmoe.AfmoeModel",
|
| 9 |
+
"AutoModelForCausalLM": "modeling_afmoe.AfmoeForCausalLM"
|
| 10 |
+
},
|
| 11 |
+
"dtype": "bfloat16",
|
| 12 |
+
"global_attn_every_n_layers": 4,
|
| 13 |
+
"head_dim": 128,
|
| 14 |
+
"hidden_act": "silu",
|
| 15 |
+
"hidden_size": 3072,
|
| 16 |
+
"initializer_range": 0.02,
|
| 17 |
+
"intermediate_size": 12288,
|
| 18 |
+
"layer_types": [
|
| 19 |
+
"sliding_attention",
|
| 20 |
+
"sliding_attention",
|
| 21 |
+
"sliding_attention",
|
| 22 |
+
"full_attention",
|
| 23 |
+
"sliding_attention",
|
| 24 |
+
"sliding_attention",
|
| 25 |
+
"sliding_attention",
|
| 26 |
+
"full_attention",
|
| 27 |
+
"sliding_attention",
|
| 28 |
+
"sliding_attention",
|
| 29 |
+
"sliding_attention",
|
| 30 |
+
"full_attention",
|
| 31 |
+
"sliding_attention",
|
| 32 |
+
"sliding_attention",
|
| 33 |
+
"sliding_attention",
|
| 34 |
+
"full_attention",
|
| 35 |
+
"sliding_attention",
|
| 36 |
+
"sliding_attention",
|
| 37 |
+
"sliding_attention",
|
| 38 |
+
"full_attention",
|
| 39 |
+
"sliding_attention",
|
| 40 |
+
"sliding_attention",
|
| 41 |
+
"sliding_attention",
|
| 42 |
+
"full_attention",
|
| 43 |
+
"sliding_attention",
|
| 44 |
+
"sliding_attention",
|
| 45 |
+
"sliding_attention",
|
| 46 |
+
"full_attention",
|
| 47 |
+
"sliding_attention",
|
| 48 |
+
"sliding_attention",
|
| 49 |
+
"sliding_attention",
|
| 50 |
+
"full_attention",
|
| 51 |
+
"sliding_attention",
|
| 52 |
+
"sliding_attention",
|
| 53 |
+
"sliding_attention",
|
| 54 |
+
"full_attention",
|
| 55 |
+
"sliding_attention",
|
| 56 |
+
"sliding_attention",
|
| 57 |
+
"sliding_attention",
|
| 58 |
+
"full_attention",
|
| 59 |
+
"sliding_attention",
|
| 60 |
+
"sliding_attention",
|
| 61 |
+
"sliding_attention",
|
| 62 |
+
"full_attention",
|
| 63 |
+
"sliding_attention",
|
| 64 |
+
"sliding_attention",
|
| 65 |
+
"sliding_attention",
|
| 66 |
+
"full_attention",
|
| 67 |
+
"sliding_attention",
|
| 68 |
+
"sliding_attention",
|
| 69 |
+
"sliding_attention",
|
| 70 |
+
"full_attention",
|
| 71 |
+
"sliding_attention",
|
| 72 |
+
"sliding_attention",
|
| 73 |
+
"sliding_attention",
|
| 74 |
+
"full_attention",
|
| 75 |
+
"sliding_attention",
|
| 76 |
+
"sliding_attention",
|
| 77 |
+
"sliding_attention",
|
| 78 |
+
"full_attention"
|
| 79 |
+
],
|
| 80 |
+
"load_balance_coeff": 5e-05,
|
| 81 |
+
"max_position_embeddings": 262144,
|
| 82 |
+
"model_type": "afmoe",
|
| 83 |
+
"moe_intermediate_size": 3072,
|
| 84 |
+
"mup_enabled": true,
|
| 85 |
+
"n_group": 1,
|
| 86 |
+
"num_attention_heads": 48,
|
| 87 |
+
"num_dense_layers": 6,
|
| 88 |
+
"num_expert_groups": 1,
|
| 89 |
+
"num_experts": 216,
|
| 90 |
+
"num_experts_per_tok": 4,
|
| 91 |
+
"num_hidden_layers": 60,
|
| 92 |
+
"num_key_value_heads": 8,
|
| 93 |
+
"num_limited_groups": 1,
|
| 94 |
+
"num_shared_experts": 1,
|
| 95 |
+
"rms_norm_eps": 1e-05,
|
| 96 |
+
"rope_scaling": null,
|
| 97 |
+
"rope_theta": 10000,
|
| 98 |
+
"route_norm": true,
|
| 99 |
+
"route_scale": 2.448,
|
| 100 |
+
"score_func": "sigmoid",
|
| 101 |
+
"sliding_window": 4096,
|
| 102 |
+
"tie_word_embeddings": false,
|
| 103 |
+
"topk_group": 1,
|
| 104 |
+
"transformers_version": "4.57.1",
|
| 105 |
+
"use_cache": true,
|
| 106 |
+
"use_grouped_mm": true,
|
| 107 |
+
"vocab_size": 200192
|
| 108 |
+
}
|
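A few quantities follow directly from the values in this config. The sketch below recomputes them as a sanity check. The pre-pruning count of 256 experts comes from the commit title, and reading `num_dense_layers` as "layers without routed experts" is an assumption, since the modeling code is not part of this diff.

```python
# Values copied from config.json; ORIGINAL_NUM_EXPERTS comes from the commit title.
config = {
    "num_experts": 216,
    "num_experts_per_tok": 4,
    "num_hidden_layers": 60,
    "num_dense_layers": 6,
    "num_attention_heads": 48,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "hidden_size": 3072,
}
ORIGINAL_NUM_EXPERTS = 256

# Fraction of routed experts removed by pruning.
pruned_fraction = 1 - config["num_experts"] / ORIGINAL_NUM_EXPERTS

# Assumption: dense layers carry no routed experts, so the rest are MoE layers.
moe_layers = config["num_hidden_layers"] - config["num_dense_layers"]

# Grouped-query attention: query heads that share each key/value head.
queries_per_kv_head = config["num_attention_heads"] // config["num_key_value_heads"]

# Total query projection width (heads * head_dim), which differs from hidden_size here.
attention_width = config["num_attention_heads"] * config["head_dim"]

print(f"pruned fraction: {pruned_fraction:.4%}")
print(f"MoE layers (assumed): {moe_layers}")
print(f"query heads per KV head: {queries_per_kv_head}")
print(f"attention width: {attention_width} vs hidden size {config['hidden_size']}")
```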
configuration_afmoe.py
ADDED
|
@@ -0,0 +1,133 @@
|
| 1 |
+
# coding=utf-8
|
| 2 |
+
# Copyright 2022 EleutherAI and the HuggingFace Inc. team. All rights reserved.
|
| 3 |
+
#
|
| 4 |
+
# Licensed under the Apache License, Version 2.0 (the "License");
|
| 5 |
+
# you may not use this file except in compliance with the License.
|
| 6 |
+
# You may obtain a copy of the License at
|
| 7 |
+
#
|
| 8 |
+
# http://www.apache.org/licenses/LICENSE-2.0
|
| 9 |
+
#
|
| 10 |
+
# Unless required by applicable law or agreed to in writing, software
|
| 11 |
+
# distributed under the License is distributed on an "AS IS" BASIS,
|
| 12 |
+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
| 13 |
+
# See the License for the specific language governing permissions and
|
| 14 |
+
# limitations under the License.
|
| 15 |
+
from transformers.configuration_utils import PretrainedConfig
|
| 16 |
+
from transformers.modeling_rope_utils import rope_config_validation
|
| 17 |
+
from transformers.configuration_utils import layer_type_validation
|
| 18 |
+
from transformers.utils import logging
|
| 19 |
+
|
| 20 |
+
logger = logging.get_logger(__name__)
|
| 21 |
+
|
| 22 |
+
class AfmoeConfig(PretrainedConfig):
|
| 23 |
+
"""
|
| 24 |
+
n_group (`int`, *optional*, defaults to 1):
|
| 25 |
+
Number of groups for routed experts.
|
| 26 |
+
topk_group (`int`, *optional*, defaults to 1):
|
| 27 |
+
Number of selected groups for each token(for each token, ensuring the selected experts is only within `topk_group` groups).
|
| 28 |
+
"""
|
| 29 |
+
model_type = "afmoe"
|
| 30 |
+
base_model_pp_plan = {
|
| 31 |
+
"embed_tokens": (["input_ids"], ["inputs_embeds"]),
|
| 32 |
+
"layers": (["hidden_states", "attention_mask"], ["hidden_states"]),
|
| 33 |
+
"norm": (["hidden_states"], ["hidden_states"]),
|
| 34 |
+
}
|
| 35 |
+
|
| 36 |
+
def __init__(
|
| 37 |
+
self,
|
| 38 |
+
num_hidden_layers: int = 32,
|
| 39 |
+
vocab_size: int = 200192,
|
| 40 |
+
hidden_size: int = 2048,
|
| 41 |
+
intermediate_size: int = 6144,
|
| 42 |
+
moe_intermediate_size=1408,
|
| 43 |
+
num_dense_layers=1,
|
| 44 |
+
num_attention_heads=16,
|
| 45 |
+
num_key_value_heads=None,
|
| 46 |
+
head_dim=128,
|
| 47 |
+
hidden_act="silu",
|
| 48 |
+
max_position_embeddings=16384,
|
| 49 |
+
initializer_range=0.02,
|
| 50 |
+
rms_norm_eps=1e-5,
|
| 51 |
+
use_cache=True,
|
| 52 |
+
tie_word_embeddings=False,
|
| 53 |
+
rope_theta=10000.0,
|
| 54 |
+
rope_scaling=None,
|
| 55 |
+
num_experts=64,
|
| 56 |
+
num_experts_per_tok=6,
|
| 57 |
+
num_shared_experts=2,
|
| 58 |
+
num_expert_groups=1,
|
| 59 |
+
num_limited_groups=1,
|
| 60 |
+
score_func="sigmoid",
|
| 61 |
+
route_norm=True,
|
| 62 |
+
route_scale=1.0,
|
| 63 |
+
global_attn_every_n_layers=4,
|
| 64 |
+
sliding_window=1024,
|
| 65 |
+
mup_enabled=False,
|
| 66 |
+
layer_types=None,
|
| 67 |
+
attention_dropout: float = 0.0,
|
| 68 |
+
n_group: int = 1,
|
| 69 |
+
topk_group: int = 1,
|
| 70 |
+
**kwargs,
|
| 71 |
+
):
|
| 72 |
+
self.vocab_size = vocab_size
|
| 73 |
+
self.max_position_embeddings = max_position_embeddings
|
| 74 |
+
self.hidden_size = hidden_size
|
| 75 |
+
self.intermediate_size = intermediate_size
|
| 76 |
+
self.num_hidden_layers = num_hidden_layers
|
| 77 |
+
self.num_dense_layers = num_dense_layers
|
| 78 |
+
self.num_attention_heads = num_attention_heads
|
| 79 |
+
self.head_dim = head_dim
|
| 80 |
+
self.hidden_act = hidden_act
|
| 81 |
+
self.initializer_range = initializer_range
|
| 82 |
+
self.rms_norm_eps = rms_norm_eps
|
| 83 |
+
self.use_cache = use_cache
|
| 84 |
+
self.rope_theta = rope_theta
|
| 85 |
+
self.rope_scaling = rope_scaling
|
| 86 |
+
|
| 87 |
+
|
| 88 |
+
# MoE specific
|
| 89 |
+
self.moe_intermediate_size = moe_intermediate_size
|
| 90 |
+
self.num_experts_per_tok = num_experts_per_tok
|
| 91 |
+
self.n_group = n_group
|
| 92 |
+
self.topk_group = topk_group
|
| 93 |
+
self.num_experts = num_experts
|
| 94 |
+
self.num_shared_experts = num_shared_experts
|
| 95 |
+
self.num_expert_groups = num_expert_groups
|
| 96 |
+
self.num_limited_groups = num_limited_groups
|
| 97 |
+
self.score_func = score_func
|
| 98 |
+
self.route_norm = route_norm
|
| 99 |
+
self.route_scale = route_scale
|
| 100 |
+
|
| 101 |
+
|
| 102 |
+
# Attention specific
|
| 103 |
+
self.attention_dropout = attention_dropout
|
| 104 |
+
self.global_attn_every_n_layers = global_attn_every_n_layers
|
| 105 |
+
self.sliding_window = sliding_window
|
| 106 |
+
self.layer_types = layer_types
|
| 107 |
+
if self.layer_types is None:
|
| 108 |
+
self.layer_types = [
|
| 109 |
+
"sliding_attention" if bool((i + 1) % global_attn_every_n_layers) else "full_attention" for i in range(self.num_hidden_layers)
|
| 110 |
+
]
|
| 111 |
+
layer_type_validation(self.layer_types)
|
| 112 |
+
|
| 113 |
+
# muP specific
|
| 114 |
+
self.mup_enabled = mup_enabled
|
| 115 |
+
|
| 116 |
+
if num_key_value_heads is None:
|
| 117 |
+
num_key_value_heads = num_attention_heads
|
| 118 |
+
|
| 119 |
+
self.num_key_value_heads = num_key_value_heads
|
| 120 |
+
|
| 121 |
+
|
| 122 |
+
# Validate rope configs
|
| 123 |
+
if self.rope_scaling is not None and "type" in self.rope_scaling:
|
| 124 |
+
self.rope_scaling["rope_type"] = self.rope_scaling["type"]
|
| 125 |
+
rope_config_validation(self)
|
| 126 |
+
|
| 127 |
+
super().__init__(
|
| 128 |
+
tie_word_embeddings=tie_word_embeddings,
|
| 129 |
+
**kwargs,
|
| 130 |
+
)
|
| 131 |
+
|
| 132 |
+
|
| 133 |
+
__all__ = ["AfmoeConfig"]
|
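When `layer_types` is not supplied, the constructor above derives it from `global_attn_every_n_layers`. The standalone sketch below repeats that list comprehension outside the class (the function name `default_layer_types` is hypothetical) and applies it to the values in this repository's config.json, 60 layers with global attention every 4th layer.

```python
def default_layer_types(num_hidden_layers, global_attn_every_n_layers):
    """Mirror the fallback in AfmoeConfig.__init__ (standalone sketch).

    Every n-th layer (1-indexed) uses full attention; all others use
    sliding-window attention.
    """
    return [
        "sliding_attention" if (i + 1) % global_attn_every_n_layers else "full_attention"
        for i in range(num_hidden_layers)
    ]


layer_types = default_layer_types(num_hidden_layers=60, global_attn_every_n_layers=4)
full_attention_layers = [i for i, kind in enumerate(layer_types) if kind == "full_attention"]
```

The result has the same repeating pattern as the explicit `layer_types` list in config.json: three sliding-attention layers followed by one full-attention layer.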
generation_config.json
ADDED
|
@@ -0,0 +1,9 @@
|
| 1 |
+
{
|
| 2 |
+
"_from_model_config": true,
|
| 3 |
+
"bos_token_id": 0,
|
| 4 |
+
"eos_token_id": 3,
|
| 5 |
+
"pad_token_id": 12,
|
| 6 |
+
"transformers_version": "4.57.3",
|
| 7 |
+
"temperature": 0.8,
|
| 8 |
+
"top_p": 0.8
|
| 9 |
+
}
|
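The generation config sets `temperature` and `top_p` to 0.8. As an illustration of what these two settings do to a next-token distribution, here is a pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is a toy version on a made-up four-token logits vector, not the Transformers implementation, and the function name `top_p_distribution` is hypothetical.

```python
import math


def top_p_distribution(logits, temperature=0.8, top_p=0.8):
    """Return renormalized probabilities after temperature scaling and top-p filtering.

    The result maps each kept token index to its renormalized probability.
    """
    # Temperature below 1 sharpens the distribution before the softmax.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]  # subtract peak for stability
    total = sum(exps)
    probs = [value / total for value in exps]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda index: probs[index], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Renormalize over the kept tokens only.
    kept_mass = sum(probs[index] for index in kept)
    return {index: probs[index] / kept_mass for index in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up values for illustration
filtered = top_p_distribution(toy_logits)
```

Sampling would then draw from `filtered` rather than from the full vocabulary, so low-probability tail tokens are never selected.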
model-00001.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6bd8a3cc805dd3ea7aa74f75868cd85822f0afea34c4f94c492b72c65b8af15d
|
| 3 |
+
size 5360493784
|
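Each `.safetensors` entry in this diff is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the weights themselves. A small sketch of reading such a pointer is shown below, using the pointer for model-00001.safetensors as input. The function name `parse_lfs_pointer` is hypothetical, and the parser assumes only the three keys that appear in this diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dictionary (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:6bd8a3cc805dd3ea7aa74f75868cd85822f0afea34c4f94c492b72c65b8af15d
size 5360493784
"""
pointer = parse_lfs_pointer(pointer_text)
print(f"{pointer['size_bytes'] / 1e9:.2f} GB, {pointer['hash_algorithm']} digest")
```

Summing `size_bytes` over all shard pointers gives the total download size without fetching any weights.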
model-00002.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:13e4e7f733cd6582137c9f08119ca26691bbc314220ce3af3a51280074b15cb2
|
| 3 |
+
size 5360356008
|
model-00003.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5cbbb30e8a276de5137e97d4b95d8c7450262698cd0e932fd6f90d585fb2c344
|
| 3 |
+
size 5360355736
|
model-00004.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:1d60471d21c292d57ef2095a4a32d2b7280a76c0c37c330f1f18f897c9459e14
|
| 3 |
+
size 5355418136
|
model-00005.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:35b6186979edd8c9ae1709d4849a93a00037b799a90728f64e0c760303cd8ba1
|
| 3 |
+
size 5360355848
|
model-00006.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:0148725021fbc2c6603337fc188aab41d6a57b8719c0653c4df493d78d596eff
|
| 3 |
+
size 5355417968
|
model-00007.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7259a2ded0dd44f041a76ccdf40415c0516b6906475245bdd5005a22cd9a006c
|
| 3 |
+
size 5360356072
|
model-00008.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:199041c7f6017b29b8a19a690a5013356a2e35b216026d21780cd550990e65bb
|
| 3 |
+
size 5355417768
|
model-00009.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:31d0480511c4b6a84d6753c5ada5c2239c71c510ff18de54790896836e3ff108
|
| 3 |
+
size 5360356024
|
model-00010.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:a692d00b8ffc0f045646db46a3ec26a620410214b74f7978fd6b5a0d9713892e
|
| 3 |
+
size 5360355704
|
model-00011.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f9415043e5558ddcbf3a63ea10459b000992fb5b3bedb3e46795f9cb1095cdf1
|
| 3 |
+
size 5355418424
|
model-00012.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:1b74548545423bfd7a21437a0ccc7907732a77962605ef6d011d7476d2dd1545
|
| 3 |
+
size 5360356104
|
model-00013.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:0fac7b7fdffad874bb8b22109f889bcce8ef35797bef5953e582df42210d7ace
|
| 3 |
+
size 5355418304
|
model-00014.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:726c387732495dc8e0f2cf19f82129932a6f2407a02f83d1c6bed67dfc592640
|
| 3 |
+
size 5360356336
|
model-00015.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:97be8da6a5c0d7f200d965ac06062ef0bd8e82c5af63d676bd2c9a54cc587878
|
| 3 |
+
size 5355418064
|
model-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fe15ccf1d6c9b9c4bdf5e6230629e2ef7796cb63e3a294af77d48a046dcd9f4a
|
| 3 |
+
size 5360356320
|
model-00017.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:0a023e0d90deac8c0278dd28c3602a24ffa136eba0b0713c9cb5614a69415d13
|
| 3 |
+
size 5360355960
|
model-00018.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:eb8016b56c4c56df7d62cc6e4ef4de239f6bf93a5d0fe5ac4376565c60581251
|
| 3 |
+
size 5355418456
|
model-00019.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:262d142fcbd151ceec5f2d81cba2da58205f11049486c1eff63613dd6eb5bc7e
|
| 3 |
+
size 5360356080
|
model-00020.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:02dcc9a34e7c3f20c1192e1b248dd241b9fa153e3768727a8ba700655738181b
|
| 3 |
+
size 5355418336
|
model-00021.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:52f6b47d3a3ad8ea372e5382784226c449d2573c701a3d50648ba712ab807913
|
| 3 |
+
size 5360356296
|
model-00022.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5ea1577db0da1f7553632ff8f2c29c7f8d748ccfe3a5723b650a155ace49ba10
|
| 3 |
+
size 5355418096
|
model-00023.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:98345f1bf8578a6eb64915723009bd55f539a3e5119f73e3142a4924e75306b7
|
| 3 |
+
size 5360356336
|
model-00024.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b49703085da342ab03bb06710b026b2dff87c6efb1a940b482afd1494ff1eb98
|
| 3 |
+
size 5360355936
|
model-00025.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:2d8c08ab4accb32f31105cc98278cbaa840a735fd2ef1c0018511e902f607d63
|
| 3 |
+
size 5355418472
|
model-00026.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:455dcc08abcd8da2c2a4f5cf42b745c21b12ef7d4c542510e031a01fd54d201b
|
| 3 |
+
size 5360356072
|
model-00027.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:56408ce7f0bcc6977eaf27be319e1f7032ea2bf0eaf3cfbb70a9c2edea801dad
|
| 3 |
+
size 5355418360
|
model-00028.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be05fb0f0aad47f6ea80a440b30049c63ecbca7f63782cb9523515459b3717e0
+size 5360356248
model-00029.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4e3b49517cdc346f9f1b5b307276477167a4ef878002a6dd2e99529d4e89fdc3
+size 5355418128
model-00030.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c779ba856def73057efc535ef1fc5e7c5f082e1ef8d3bc00bac1001689f75716
+size 5360356352
model-00031.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8cf307e21b738fb5c7059e6cd605b7e5216f8df7e0e99c1f701e4d0caef00e1e
+size 5355418112
model-00032.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e5154539c76729c8d02a240922d72b2b668af41404c2287281928a9bc20db9da
+size 5360356280
model-00033.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ec388a6b1850cc4570c6f98b2e8f76686480e42eb0bc401fa294131d006706ac
+size 5360356056
model-00034.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:377644117ca5f2d762d91a8534be72b11a703a4ba5b209c1e47dead8f8be21db
+size 5355418392
model-00035.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f794c8df17b3fcaed8703205ce36bf90f8f96a67ef7866a7e1c2201a60b5c38
+size 5360356208
model-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5fd5a9a20983c19de79bdc78d80778224be7b4d5252ed8141174c9cc4a94ec46
+size 5355418160
model-00037.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f866471f89c16668fb5525354eca9298971eed578d8545d7a2bf5f86144788fd
+size 5360356360
model-00038.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3bcd4ca116947f5d9146c5ad32e7be1ec6ad5a5578dc002c1d3b1c9d87827699
+size 5355418096
model-00039.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:97aadeb189ff7a057752a81b8a53b164af010b8ad538c318fcf9cd76f2359e23
+size 5360356288
model-00040.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25ea3b83a2c88b920840aadff967be424d66ad5efd3951333041bfdbb481210e
+size 5360356040
model-00041.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4b1ec268400a33cd0918fc40aecf94c91c99260f23516f5cfb6f1313165c1bb
+size 5355418416
model-00042.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:001c9aba6352681e25bc5273fff06aca32876a0f2121a779d1fd2f233945cac9
+size 5360356168
model-00043.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:836ff1af0bfa793202d75464d1d6e5c2c0173cbfeaaa3d077c70c60272343ff5
+size 5355418200
model-00044.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:789c2ff30ba31ed0987bb322ef9b6da2d02715d7925792c732badcd2292b0737
+size 5360356360
model-00045.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b64d45e3b88083195b92f2defd0c94ac084dffdd003a99118dee1deb46e03218
+size 5355418080