GLM-4.7-NVFP4

Format: NVFP4 (partial quantization of weights and activations to NVFP4, with the quantized layers selected by AutoQuantize).
Base model: zai-org/GLM-4.7
How it was made: AutoQuantized with NVIDIA Model-Optimizer (NVFP4) on 8x RTX PRO 6000 GPUs, using the default calibration mix (cnn_dailymail and nemotron-post-training-dataset-v2).
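
For readers unfamiliar with the format, the following is a minimal sketch of the block-scaled 4-bit rounding that NVFP4 is built on: each block of 16 values shares one scale, and each value is rounded to the nearest FP4 (E2M1) magnitude. It is an illustration only, not the Model-Optimizer implementation. The function names are hypothetical, and the sketch assumes a full-precision block scale (real NVFP4 stores the block scale in FP8 E4M3 and adds a per-tensor scale).

```python
# Magnitudes representable by FP4 E2M1; the sign is stored separately.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # NVFP4 shares one scale across each 16-element block


def quantize_block(block):
    """Fake-quantize one block; returns (scale, dequantized values)."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0.0, [0.0] * len(block)
    # Assumption: full-precision scale that maps the largest magnitude onto 6.0.
    scale = amax / FP4_MAGNITUDES[-1]
    dequantized = []
    for x in block:
        # Round the scaled magnitude to the nearest representable FP4 value.
        nearest = min(FP4_MAGNITUDES, key=lambda m: abs(abs(x) / scale - m))
        dequantized.append((nearest if x >= 0 else -nearest) * scale)
    return scale, dequantized


def fake_quantize(values):
    """Quantize and dequantize a flat list of floats block by block."""
    result = []
    for start in range(0, len(values), BLOCK_SIZE):
        _, dequantized = quantize_block(values[start:start + BLOCK_SIZE])
        result.extend(dequantized)
    return result


if __name__ == "__main__":
    weights = [i / 10 for i in range(-8, 8)]
    for original, rounded in zip(weights, fake_quantize(weights)):
        print(f"{original:+.2f} -> {rounded:+.4f}")
```

The rounding error within a block is bounded by the block scale, which is why layers with outliers lose the most precision and why a partial scheme may leave some layers in a higher-precision format.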

Check the original model card for information about this model.


MMLU Benchmark Results: Salyut1/GLM-4.7-NVFP4

Summary Table

| Groups | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| MMLU (Total) | 2 | acc ↑ | 0.8348 | ± 0.0030 |
| Social Sciences | 2 | acc ↑ | 0.9051 | ± 0.0052 |
| Other | 2 | acc ↑ | 0.8684 | ± 0.0058 |
| STEM | 2 | acc ↑ | 0.8351 | ± 0.0064 |
| Humanities | 2 | acc ↑ | 0.7664 | ± 0.0059 |
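
To read the Stderr column as an uncertainty range, one option is a normal-approximation 95% interval (value ± 1.96 × stderr). The sketch below applies that to the group rows above. The helper name is hypothetical and the choice of a 95% normal interval is an assumption, not something stated by the benchmark output.

```python
def confidence_interval(value, stderr, z=1.96):
    """Return the (low, high) normal-approximation interval around an accuracy."""
    return value - z * stderr, value + z * stderr


# (accuracy, stderr) pairs copied from the summary table
groups = {
    "MMLU (Total)": (0.8348, 0.0030),
    "Social Sciences": (0.9051, 0.0052),
    "Other": (0.8684, 0.0058),
    "STEM": (0.8351, 0.0064),
    "Humanities": (0.7664, 0.0059),
}

if __name__ == "__main__":
    for name, (acc, se) in groups.items():
        low, high = confidence_interval(acc, se)
        print(f"{name}: {acc:.4f} [{low:.4f}, {high:.4f}]")
```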

STEM

| Tasks | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|
| High School Biology | 0 | acc ↑ | 0.9516 | ± 0.0122 |
| College Biology | 0 | acc ↑ | 0.9514 | ± 0.0180 |
| Astronomy | 0 | acc ↑ | 0.9474 | ± 0.0182 |
| High School Computer Science | 0 | acc ↑ | 0.9300 | ± 0.0256 |
| Conceptual Physics | 0 | acc ↑ | 0.9064 | ± 0.0190 |
| Elementary Mathematics | 0 | acc ↑ | 0.8862 | ± 0.0164 |
| Electrical Engineering | 0 | acc ↑ | 0.8690 | ± 0.0281 |
| High School Statistics | 0 | acc ↑ | 0.8565 | ± 0.0239 |
| College Computer Science | 0 | acc ↑ | 0.8400 | ± 0.0368 |
| Anatomy | 0 | acc ↑ | 0.8296 | ± 0.0325 |
| High School Physics | 0 | acc ↑ | 0.7947 | ± 0.0330 |
| High School Chemistry | 0 | acc ↑ | 0.7882 | ± 0.0287 |
| Machine Learning | 0 | acc ↑ | 0.7679 | ± 0.0401 |
| College Physics | 0 | acc ↑ | 0.7647 | ± 0.0422 |
| Abstract Algebra | 0 | acc ↑ | 0.6800 | ± 0.0469 |
| College Chemistry | 0 | acc ↑ | 0.6800 | ± 0.0469 |
| College Mathematics | 0 | acc ↑ | 0.6800 | ± 0.0469 |
| High School Mathematics | 0 | acc ↑ | 0.6481 | ± 0.0291 |
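
The per-task Stderr values are consistent with the standard error of a mean of 0/1 scores, i.e. sqrt(p(1 - p) / (n - 1)). The sketch below checks this for the three tasks that score 0.6800. Both the formula and the question count of 100 per task are assumptions made for illustration; neither is stated in this card.

```python
import math


def accuracy_stderr(acc, n):
    """Standard error of the mean of n binary (correct/incorrect) scores."""
    # Sample standard deviation of 0/1 scores divided by sqrt(n).
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


if __name__ == "__main__":
    # Assumption: 100 questions in the task (not stated in the card).
    print(round(accuracy_stderr(0.68, 100), 4))  # matches the reported 0.0469
```

Under the same assumption, the larger Stderr on small tasks reflects the question count rather than model instability, so differences of a few points on those rows should be read with care.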

Social Sciences

| Tasks | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|
| High School Government/Politics | 0 | acc ↑ | 0.9793 | ± 0.0103 |
| High School Microeconomics | 0 | acc ↑ | 0.9706 | ± 0.0110 |
| High School Psychology | 0 | acc ↑ | 0.9523 | ± 0.0091 |
| Human Sexuality | 0 | acc ↑ | 0.9313 | ± 0.0222 |
| Sociology | 0 | acc ↑ | 0.9204 | ± 0.0191 |
| High School Geography | 0 | acc ↑ | 0.9192 | ± 0.0194 |
| High School Macroeconomics | 0 | acc ↑ | 0.9000 | ± 0.0152 |
| US Foreign Policy | 0 | acc ↑ | 0.9000 | ± 0.0302 |
| Professional Psychology | 0 | acc ↑ | 0.8725 | ± 0.0135 |
| Security Studies | 0 | acc ↑ | 0.8653 | ± 0.0219 |
| Public Relations | 0 | acc ↑ | 0.7636 | ± 0.0407 |
| Econometrics | 0 | acc ↑ | 0.7544 | ± 0.0405 |

Humanities

| Tasks | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|
| High School US History | 0 | acc ↑ | 0.9461 | ± 0.0159 |
| High School World History | 0 | acc ↑ | 0.9367 | ± 0.0158 |
| World Religions | 0 | acc ↑ | 0.9064 | ± 0.0223 |
| Prehistory | 0 | acc ↑ | 0.8981 | ± 0.0168 |
| International Law | 0 | acc ↑ | 0.8926 | ± 0.0283 |
| Jurisprudence | 0 | acc ↑ | 0.8889 | ± 0.0304 |
| Logical Fallacies | 0 | acc ↑ | 0.8834 | ± 0.0252 |
| High School European History | 0 | acc ↑ | 0.8788 | ± 0.0255 |
| Moral Disputes | 0 | acc ↑ | 0.8699 | ± 0.0181 |
| Philosophy | 0 | acc ↑ | 0.8617 | ± 0.0196 |
| Formal Logic | 0 | acc ↑ | 0.7460 | ± 0.0389 |
| Professional Law | 0 | acc ↑ | 0.6610 | ± 0.0121 |
| Moral Scenarios | 0 | acc ↑ | 0.6425 | ± 0.0160 |

Other

| Tasks | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|
| Medical Genetics | 0 | acc ↑ | 0.9800 | ± 0.0141 |
| Marketing | 0 | acc ↑ | 0.9530 | ± 0.0139 |
| Miscellaneous | 0 | acc ↑ | 0.9374 | ± 0.0087 |
| Professional Medicine | 0 | acc ↑ | 0.9301 | ± 0.0155 |
| Clinical Knowledge | 0 | acc ↑ | 0.9057 | ± 0.0180 |
| Nutrition | 0 | acc ↑ | 0.9052 | ± 0.0168 |
| Management | 0 | acc ↑ | 0.8932 | ± 0.0306 |
| Business Ethics | 0 | acc ↑ | 0.8600 | ± 0.0349 |
| Computer Security | 0 | acc ↑ | 0.8600 | ± 0.0349 |
| Human Aging | 0 | acc ↑ | 0.8161 | ± 0.0260 |
| College Medicine | 0 | acc ↑ | 0.7977 | ± 0.0306 |
| Professional Accounting | 0 | acc ↑ | 0.7624 | ± 0.0254 |
| Global Facts | 0 | acc ↑ | 0.6500 | ± 0.0479 |
| Virology | 0 | acc ↑ | 0.5723 | ± 0.0385 |

vLLM Inference Note:

I needed to patch vllm/model_executor/models/glm4_moe.py so that k_scale and v_scale weights whose names are not found in params_dict are skipped instead of crashing the load. The script below fixed my k_scale and v_scale errors.

import os
import re

# Path to the installed vLLM model file (adjust for your environment)
path = '/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/glm4_moe.py'

# Line that looks up each parameter during weight loading
target_str = 'param = params_dict[name]'
# Guard condition; also used to detect a file patched by an earlier run
marker = "if ('k_scale' in name or 'v_scale' in name) and name not in params_dict:"

if not os.path.exists(path):
    print(f"File not found: {path}")
else:
    with open(path, 'r') as f:
        lines = f.readlines()

    new_lines = []
    patched = False

    for line in lines:
        # Skip lookup lines that already have the guard directly above them
        already_guarded = bool(new_lines) and marker in new_lines[-1]
        if target_str in line and not already_guarded:
            whitespace = re.match(r'^(\s*)', line).group(1)
            # Inject the guard: a k_scale/v_scale name absent from params_dict is skipped
            new_lines.append(f"{whitespace}{marker} continue\n")
            patched = True
        new_lines.append(line)

    if patched:
        with open(path, 'w') as f:
            f.writelines(new_lines)
        print(f"Successfully patched {path}")
    else:
        print("File already patched or target not found.")
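
To show what the injected guard does at load time, here is a toy stand-in for a weight-loading loop. It is a sketch under assumptions: the function, the parameter names, and the dict-based "parameters" are hypothetical and only mimic the shape of the loop being patched, not vLLM's actual loader.

```python
def load_weights(checkpoint_weights, params_dict):
    """Copy checkpoint entries into params_dict, skipping unmatched k/v scales."""
    loaded, skipped = [], []
    for name, tensor in checkpoint_weights:
        # Same condition the patch injects before the params_dict lookup.
        if ('k_scale' in name or 'v_scale' in name) and name not in params_dict:
            skipped.append(name)
            continue
        param = params_dict[name]  # still raises KeyError for any other unknown name
        param['data'] = tensor
        loaded.append(name)
    return loaded, skipped


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    params = {'layers.0.attn.qkv.weight': {'data': None}}
    checkpoint = [
        ('layers.0.attn.qkv.weight', [1.0, 2.0]),
        ('layers.0.attn.k_scale', 1.0),
        ('layers.0.attn.v_scale', 1.0),
    ]
    loaded, skipped = load_weights(checkpoint, params)
    print("loaded:", loaded)
    print("skipped:", skipped)
```

The guard is deliberately narrow: only k_scale and v_scale names are skipped, so any other mismatch between checkpoint and model still fails loudly.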