Files changed (1)
README.md CHANGED (+18 -14)
@@ -74,24 +74,24 @@ The integration of foundation and fine-tuned models into AI systems requires add
 ## Training, Testing, and Evaluation Datasets:
 
 ## Calibration Dataset:
-** Link: [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail), [Nemotron-Post-Training-Dataset-v2](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2) <br>
-** Data collection method: Automated. <br>
-** Labeling method: Automated. <br>
+* Link: [Nemotron-Post-Training-Dataset-v2](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2) <br>
+* Data collection method: Automated. <br>
+* Labeling method: Automated. <br>
 
 ## Training Datasets:
-** Data Collection Method by Dataset: Undisclosed <br>
-** Labeling Method by Dataset: Undisclosed<br>
-** Properties: Undisclosed
+* Data Collection Method by Dataset: Undisclosed <br>
+* Labeling Method by Dataset: Undisclosed <br>
+* Properties: Undisclosed
 
 ## Testing Dataset:
-** Data Collection Method by Dataset: Undisclosed <br>
-** Labeling Method by Dataset: Undisclosed <br>
-** Properties: Undisclosed <br>
+* Data Collection Method by Dataset: Undisclosed <br>
+* Labeling Method by Dataset: Undisclosed <br>
+* Properties: Undisclosed <br>
 
 ## Evaluation Dataset:
 * Datasets: MMLU Pro, GPQA Diamond, LiveCodeBench V6, SciCode, AIME 2025 <br>
-** Data collection method: Hybrid: Automated, Human <br>
-** Labeling method: Hybrid: Human, Automated <br>
+* Data collection method: Hybrid: Automated, Human <br>
+* Labeling method: Hybrid: Human, Automated <br>
 
 
 ## Inference:
@@ -99,7 +99,7 @@ The integration of foundation and fine-tuned models into AI systems requires add
 **Test Hardware:** B300 <br>
 
 ## Post Training Quantization
-This model was obtained by quantizing the weights and activations of Qwen3-Coder-Next to NVFP4 data type, ready for inference with TensorRT-LLM. Only the weights and activations of the linear operators within transformer blocks are quantized. This optimization reduces the number of bits per parameter from 16 to 4, reducing the disk size and GPU memory requirements by approximately 4x.
+This model was obtained by quantizing the weights and activations of Qwen3-Coder-Next to the NVFP4 data type, ready for inference with SGLang. Only the weights and activations of the linear operators within transformer blocks are quantized, and the KV cache is additionally quantized to FP8. This optimization reduces the number of bits per parameter from 16 to 4, cutting disk size and GPU memory requirements by approximately 4x.
 
 ## Usage
 
@@ -110,8 +110,12 @@ To serve the quantized NVFP4 checkpoint with [SGLang](https://github.com/sgl-project/sglang):
 ```bash
 sglang serve --model-path vincentzed-hf/Qwen3-Coder-Next-NVFP4 --quantization modelopt_fp4
 ```
-Please use this branch and install from source: https://github.com/sgl-project/sglang/pull/18224
-Once the branch is cloned, do `pip install -e .` annd run the serve command.
+Please install SGLang from source:
+`git clone git@github.com:sgl-project/sglang.git`
+Once the repo is cloned, run `uv pip install -e . "python"` and then run the serve command above.
+We will update this model card once a release that includes the bugfix for this model's launch is cut.
 
 ### Reproduce with ModelOpt
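As a sanity check on the "approximately 4x" figure in the Post Training Quantization paragraph of the new README text: the reduction follows directly from the per-parameter bit widths. The sketch below uses a placeholder parameter count (not the actual size of Qwen3-Coder-Next) and ignores NVFP4's per-block scale factors and the unquantized layers, which is why the real-world saving is only approximate.

```python
def approx_weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Rough weight-storage estimate; ignores quantization scale
    factors and unquantized layers (embeddings, norms, etc.)."""
    return num_params * bits_per_param / 8

n = 10_000_000_000  # placeholder parameter count, for illustration only

bf16_bytes = approx_weight_bytes(n, 16)  # 16-bit baseline
nvfp4_bytes = approx_weight_bytes(n, 4)  # NVFP4: 4 bits per parameter

print(f"{bf16_bytes / nvfp4_bytes:.0f}x smaller")  # 4x smaller
```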
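Once the `sglang serve` command from the Usage section is running, the checkpoint can be queried over SGLang's OpenAI-compatible HTTP API. The sketch below only constructs the request body; the endpoint URL shown in the comment (default port 30000, `/v1/chat/completions` path) is an assumption to verify against your server's startup logs before POSTing it, e.g. with `curl`.

```python
import json

# Request body for the OpenAI-compatible chat-completions endpoint.
# Assumed URL (verify against your `sglang serve` output):
#   http://localhost:30000/v1/chat/completions
payload = {
    "model": "vincentzed-hf/Qwen3-Coder-Next-NVFP4",
    "messages": [
        {"role": "user",
         "content": "Write a function that checks if a string is a palindrome."}
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}

body = json.dumps(payload)  # serialized JSON, ready to send
print(body)
```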