how to fix: KeyError: 'model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight'

#1
by kq - opened

(sglang) u@server:/models$ python -m sglang.launch_server --model-path /home/deaf/Qwen3-Coder-Next-AWQ-4bit --tp 4 --kv-cache-dtype fp8_e5m2 --trust-remote-code --disable-cuda-graph-padding --context-length 262144 --served-model-name qwen3-coder-next --tool-call-parser qwen3_coder --port 8000 --host 0.0.0.0

...

KeyError: 'model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight'

[2026-02-05 08:48:30 TP2] Scheduler hit an exception: Traceback (most recent call last):
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 2937, in run_scheduler_process
    scheduler = Scheduler(
                ^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 346, in __init__
    self.init_model_worker()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 535, in init_model_worker
    self.init_tp_model_worker()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 497, in init_tp_model_worker
    self.tp_worker = TpModelWorker(
                     ^^^^^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 246, in __init__
    self._init_model_runner()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 329, in _init_model_runner
    self._model_runner = ModelRunner(
                         ^^^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 383, in __init__
    self.initialize(min_per_gpu_memory)
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 460, in initialize
    self.load_model()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 889, in load_model
    self.model = self.loader.load_model(
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_loader/loader.py", line 662, in load_model
    self.load_weights_and_postprocess(
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_loader/loader.py", line 670, in load_weights_and_postprocess
    model.load_weights(weights)
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/models/qwen3_next.py", line 1047, in load_weights
    param = params_dict[name]
            ~~~~~~~~~~~^^^^^^
KeyError: 'model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight'

Gemini3 Pro:
Core Diagnosis
Error Location: /home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/models/qwen3_next.py (Line 1047)

Error Cause: KeyError: '...gate_gate_up_proj.weight'

Analysis: While loading the shared_expert layer, load_weights appears to build a malformed parameter name. The prefix gate_ seems to have been prepended to gate_up_proj, which is already the fused name, producing gate_gate_up_proj. No parameter with that name exists in params_dict, so the lookup raises KeyError.
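
One way a doubled prefix like this can arise, offered as a guess rather than a reading of the actual SGLang source: loaders for fused layers often rename checkpoint shards by substring replacement, and the substring up_proj also occurs inside the fused name gate_up_proj. The mapping table and both functions below are hypothetical and exist only to show the mechanism.

```python
# Hypothetical shard-to-fused-parameter mapping (not copied from SGLang).
STACKED_MAPPING = [
    ("gate_up_proj", "gate_proj"),  # (fused name, checkpoint shard name)
    ("gate_up_proj", "up_proj"),
]


def naive_rename(name: str) -> str:
    """Apply every matching replacement without stopping after the first hit."""
    for fused, shard in STACKED_MAPPING:
        if shard in name:
            name = name.replace(shard, fused)
    return name


def safe_rename(name: str) -> str:
    """Stop after the first matching shard so a renamed key is not renamed again."""
    for fused, shard in STACKED_MAPPING:
        if shard in name:
            return name.replace(shard, fused)
    return name


key = "model.layers.30.mlp.shared_expert.gate_proj.weight"
print(naive_rename(key))  # ends in gate_gate_up_proj.weight
print(safe_rename(key))   # ends in gate_up_proj.weight
```

In the naive version the first replacement turns gate_proj into gate_up_proj, and the second then matches up_proj inside the result. Whether this is what happens at line 1047 can only be confirmed by reading the installed qwen3_next.py.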

Solution: Manually Patch the Source Code
Please follow the steps below to modify the file:

Open the file:

vim /home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/models/qwen3_next.py

Go to around Line 1047: Locate the part of the load_weights function that handles params_dict. You might see logic similar to this (pseudocode):

# Example problematic code (hypothetical)
if "shared_expert" in name:
    # This logic may incorrectly add a prefix
    key_name = name.replace("mlp", "mlp.shared_expert")
    # or similar string construction causing duplication

Apply the Fix (Quick Fix): Insert a simple key name correction right before the line that raises the KeyError. Since the key erroneously contains an extra gate_, remove it.

Insert the following code just before the line param = params_dict[name]:

# --- INSERT START ---
# Workaround: strip the duplicated "gate_" prefix from the shared-expert weight name
if "gate_gate_up_proj" in name:
    name = name.replace("gate_gate_up_proj", "gate_up_proj")
# --- INSERT END ---

Then keep the original line unchanged:

param = params_dict[name]  # original code

Alternatively, if the name variable is constructed dynamically elsewhere, locate where the extra "gate_" prefix is added and remove the redundant part.
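
The quick fix can also be written as a small guarded helper, sketched here against a toy dictionary. The function name normalize_weight_name and the toy params_dict are assumptions for illustration; only the key strings come from the traceback. The guard rewrites the name only when the corrected key actually exists, so unrelated names pass through unchanged.

```python
def normalize_weight_name(name: str, params_dict: dict) -> str:
    """Strip a duplicated "gate_" prefix, but only if the corrected key exists."""
    if "gate_gate_up_proj" in name:
        fixed = name.replace("gate_gate_up_proj", "gate_up_proj")
        if fixed in params_dict:
            return fixed
    return name


# Toy stand-in for the model's parameter dictionary (the value is a placeholder).
params_dict = {"model.layers.30.mlp.shared_expert.gate_up_proj.weight": object()}

bad = "model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight"
name = normalize_weight_name(bad, params_dict)
param = params_dict[name]  # the corrected name is found, so no KeyError
print(name)
```

This only corrects the lookup name. Whether the tensor is then loaded into the right shard depends on the surrounding loader code, which is not shown in this thread, and the patch is lost whenever the sglang package is reinstalled or upgraded.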

Fixed.

Thanks, Ton Cao, for your solid work!

cyankiwi org

Thanks for using my quant and raising this problem with me! As an alternative to your fix, please re-download the config.json file; the model should then work with sglang :)
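
A minimal sketch of re-downloading only config.json with the huggingface_hub library, assuming the model directory from the launch command above. The helper name redownload_config is hypothetical, and the repository id is deliberately left as a placeholder because it is not stated in this thread.

```python
from pathlib import Path


def redownload_config(repo_id: str, local_dir: str) -> Path:
    """Fetch a fresh config.json from the Hub into local_dir, replacing the old copy."""
    # Imported lazily so the helper can be defined without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id=repo_id,
        filename="config.json",
        local_dir=local_dir,
        force_download=True,  # bypass any cached copy of the old file
    )
    return Path(path)


# Example call (the repo id is a placeholder; substitute the quant's actual repo):
# redownload_config("<org>/<repo>", "/home/deaf/Qwen3-Coder-Next-AWQ-4bit")
```

If the source patch was applied earlier, reverting it before retrying makes it clear whether the new config.json alone resolves the error.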
