zhiyucheng committed commit 47fa7dc (verified) · 1 parent: b2dd9ee

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +5 -16
README.md CHANGED

@@ -36,7 +36,7 @@ Global <br>
 Developers looking to take off-the-shelf, pre-quantized models for deployment in AI Agent systems, chatbots, RAG systems, and other AI-powered applications. <br>

 ### Release Date: <br>
-Huggingface [TBD] via https://huggingface.co/nvidia/GLM-4.7-NVFP4 <br>
+Huggingface 03/25/2026 via https://huggingface.co/nvidia/GLM-4.7-NVFP4 <br>

 ## Model Architecture:
 **Architecture Type:** Transformers <br>

@@ -53,7 +53,7 @@ Huggingface [TBD] via https://huggingface.co/nvidia/GLM-4.7-NVFP4 <br>
 **Output Type(s):** Text <br>
 **Output Format:** String <br>
 **Output Parameters:** 1D (One-Dimensional): Sequences <br>
-**Other Properties Related to Output:** N/A <br>
+**Other Properties Related to Output:** None <br>

 Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions. <br>

@@ -78,10 +78,11 @@ The model version is NVFP4 1.0 version and is quantized with nvidia-modelopt **v
 ** Link: [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail), [Nemotron-Post-Training-Dataset-v2](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2) <br>
 ** Data Collection Method by dataset: Automated. <br>
 ** Labeling method: Automated. <br>
-** Properties: The cnn_dailymail dataset is an English-language dataset containing just over 300k unique news articles as written by journalists at CNN and the Daily Mail. <br>
+** Properties: The cnn_dailymail dataset is an English-language dataset containing just over 300k unique news articles as written by journalists at CNN and the Daily Mail.<br> Nemotron-Post-Training-Dataset-v2 dataset is NVIDIA’s post-training dataset with an extension of SFT and RL data into five target languages: Spanish, French, German, Italian and Japanese. The data supports improvements of math, code, general reasoning, and instruction following capabilities. <br>

 ## Training Dataset:
-** Data Modality: Undisclosed <br>
+** Data Modality: Text <br>
+** Text Training Data Size: Undisclosed <br>
 ** Data Collection Method by dataset: Undisclosed <br>
 ** Labeling Method by dataset: Undisclosed<br>
 ** Properties: Undisclosed

@@ -129,10 +130,6 @@ The accuracy benchmark results are presented in the table below:
 </td>
 <td><strong>AIME 2025</strong>
 </td>
-<td><strong>tau2_bench_telecom</strong>
-</td>
-<td><strong>ns_aa_lcr</strong>
-</td>
 </tr>
 <tr>
 <td>FP8

@@ -147,10 +144,6 @@ The accuracy benchmark results are presented in the table below:
 </td>
 <td>0.960
 </td>
-<td>0.982
-</td>
-<td>0.653
-</td>
 </tr>
 <tr>
 <td>NVFP4

@@ -165,10 +158,6 @@ The accuracy benchmark results are presented in the table below:
 </td>
 <td>0.960
 </td>
-<td>0.968
-</td>
-<td>0.640
-</td>
 </tr>
 </table>
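The benchmark table in the diff above reports identical AIME 2025 scores (0.960) for the FP8 baseline and the NVFP4 checkpoint. A minimal sketch of the comparison the table implies, with the scores copied from the diff (the helper name `accuracy_retention` is ours, not part of the model card):

```python
def accuracy_retention(quantized: float, baseline: float) -> float:
    """Fraction of the baseline model's benchmark score that the
    quantized model retains (1.0 means no measured accuracy loss)."""
    return quantized / baseline

# AIME 2025 scores from the benchmark table in the README diff.
fp8_aime_2025 = 0.960
nvfp4_aime_2025 = 0.960

retention = accuracy_retention(nvfp4_aime_2025, fp8_aime_2025)
print(f"NVFP4 retains {retention:.1%} of FP8 accuracy on AIME 2025")
```

On this benchmark the ratio is exactly 1.0, i.e. the NVFP4 quantization shows no measured accuracy drop relative to FP8.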