how to fix: KeyError: 'model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight'

by kq - opened Feb 5

Feb 5

(sglang) u@server:/models$ python -m sglang.launch_server --model-path /home/deaf/Qwen3-Coder-Next-AWQ-4bit --tp 4 --kv-cache-dtype fp8_e5m2 --trust-remote-code --disable-cuda-graph-padding --context-length 262144 --served-model-name qwen3-coder-next --tool-call-parser qwen3_coder --port 8000 --host 0.0.0.0

...

KeyError: 'model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight'

[2026-02-05 08:48:30 TP2] Scheduler hit an exception: Traceback (most recent call last):
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 2937, in run_scheduler_process
    scheduler = Scheduler(
                ^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 346, in __init__
    self.init_model_worker()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 535, in init_model_worker
    self.init_tp_model_worker()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 497, in init_tp_model_worker
    self.tp_worker = TpModelWorker(
                     ^^^^^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 246, in __init__
    self._init_model_runner()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 329, in _init_model_runner
    self._model_runner = ModelRunner(
                         ^^^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 383, in __init__
    self.initialize(min_per_gpu_memory)
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 460, in initialize
    self.load_model()
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 889, in load_model
    self.model = self.loader.load_model(
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_loader/loader.py", line 662, in load_model
    self.load_weights_and_postprocess(
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/model_loader/loader.py", line 670, in load_weights_and_postprocess
    model.load_weights(weights)
  File "/home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/models/qwen3_next.py", line 1047, in load_weights
    param = params_dict[name]
            ~~~~~~~~~~~^^^^^^
KeyError: 'model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight'

Gemini3 Pro:
Core Diagnosis
Error Location: /home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/models/qwen3_next.py (Line 1047)

Error Cause: KeyError: '...gate_gate_up_proj.weight'

Analysis: While loading the shared_expert layer, load_weights builds a parameter name that does not exist in params_dict. The name-rewriting logic most likely prepended an extra gate_ to a name that already contained gate_up_proj, producing the malformed key gate_gate_up_proj.

Solution: Manually Patch the Source Code
Please follow the steps below to modify the file:

Open the file:

vim /home/deaf/miniconda3/envs/sglang/lib/python3.11/site-packages/sglang/srt/models/qwen3_next.py

Go to around Line 1047: Locate the part of the load_weights function that handles params_dict. You might see logic similar to this (pseudocode):

# Example problematic code (hypothetical, not the actual SGLang source)
if "shared_expert" in name:
    # A substring replacement like this doubles the prefix when the
    # name already contains "gate_up_proj"
    name = name.replace("up_proj", "gate_up_proj")

Apply the Fix (Quick Fix): Insert a simple key name correction right before the line that raises the KeyError. Since the key erroneously contains an extra gate_, remove it.

Insert the following code just before the line param = params_dict[name]:

# --- INSERT START ---
# Fix SGLang's bug in Qwen3 Shared Expert weight naming
if "gate_gate_up_proj" in name:
    name = name.replace("gate_gate_up_proj", "gate_up_proj")
# --- INSERT END ---

Then keep the original line unchanged:

param = params_dict[name]  # original code

Alternatively, if the name variable is constructed dynamically elsewhere, locate where the extra "gate_" prefix is added and remove the redundant part.
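Before editing the installed package, the rename can be tried in isolation. This is only a minimal sketch: the helper name fix_doubled_gate_prefix and the one-entry params_dict are made up for illustration and are not part of SGLang; only the two weight names come from the error above.

```python
def fix_doubled_gate_prefix(name: str) -> str:
    """Collapse an accidental 'gate_gate_up_proj' back to 'gate_up_proj'."""
    return name.replace("gate_gate_up_proj", "gate_up_proj")


# Stand-in for the model's parameter dictionary (hypothetical contents).
params_dict = {
    "model.layers.30.mlp.shared_expert.gate_up_proj.weight": "placeholder",
}

# The key from the KeyError, and the same key after the workaround.
bad_name = "model.layers.30.mlp.shared_expert.gate_gate_up_proj.weight"
fixed_name = fix_doubled_gate_prefix(bad_name)

print(bad_name in params_dict)    # False: this lookup is what raised KeyError
print(fixed_name in params_dict)  # True: the corrected key resolves
```

Names without the doubled prefix pass through unchanged, so the inserted lines should not affect other weights. Note that this only hides the symptom at lookup time; it does not explain why the doubled name was produced.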

Fixed.

Feb 5

Thanks, Ton Cao. Thanks for your solid work!

cpatonn

cyankiwi org Feb 5

Thanks for using my quant and raising this problem with me! As an alternative to your fix, please redownload the config.json file, and it should work with sglang :)
