Update README.md

README.md CHANGED

@@ -13,8 +13,7 @@ tags:
 library_name: transformers
 ---
-# DeepSeek-R1-Distill-Qwen-14B-NVFP4 (Blackwell Optimized)
+# DeepSeek-R1-Distill-Qwen-14B-NVFP4 (Work in Progress)
 This repository contains a self-quantized version of **DeepSeek-R1-Distill-Qwen-14B** using the **NVIDIA NVFP4** format. This was produced on an **Asus Ascent GX10** (NVIDIA GB10 Grace Blackwell) system using the NVIDIA ModelOptimizer playbook.
 ## Hardware & Architecture
@@ -23,11 +22,6 @@ This repository contains a self-quantized version of **DeepSeek-R1-Distill-Qwen-
 - **Memory:** 128GB Coherent Unified Memory (LPDDR5X)
 - **Format:** NVFP4 (4-bit Floating Point) with two-level micro-block scaling.
-## The 14B "Reasoning" Advantage
-The 14B Qwen-distill is widely considered the sweet spot for Blackwell users.
-- **Efficiency:** In NVFP4, this model consumes approximately **9GB-10GB** of VRAM, leaving massive room for KV cache (context memory) on the GX10.
-- **IQ vs Size:** Distilled from the massive 671B R1, the 14B version significantly outperforms the 8B Llama variant in complex coding and mathematical reasoning benchmarks.
 ## Current Performance Status (Jan 2026)
 Tested on vLLM, but performance on the GX10 is currently inconsistent.
 - **Stuttering:** There is a known rhythmic stutter in current vLLM builds when running NVFP4 on SM121.
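
Review note: the removed "Efficiency" bullet quotes roughly 9GB-10GB for the NVFP4 weights. The figure can be sanity-checked with a back-of-envelope estimate. The sketch below is not taken from the README; it assumes the usual NVFP4 layout (4-bit values with one 8-bit scale per 16-element block, per-tensor FP32 scales ignored as negligible), and it assumes that the embedding table and `lm_head` stay in 16-bit precision. The parameter counts are approximate figures for the Qwen2.5-14B architecture and are also assumptions, as is the helper name `nvfp4_weight_bytes`.

```python
def nvfp4_weight_bytes(quantized_params, unquantized_params,
                       block_size=16, scale_bits=8, unquantized_bits=16):
    """Estimate weight storage in bytes for a partially NVFP4-quantized model.

    Each quantized parameter costs 4 bits plus its share of one block scale
    (scale_bits / block_size). Unquantized parameters cost unquantized_bits.
    """
    bits_per_quantized = 4 + scale_bits / block_size  # 4.5 bits with defaults
    quantized_bytes = quantized_params * bits_per_quantized / 8
    unquantized_bytes = unquantized_params * unquantized_bits / 8
    return quantized_bytes + unquantized_bytes


# Assumed figures (approximate, not from the README):
# about 13.1e9 parameters in the transformer blocks, plus an untied
# embedding table and lm_head of 152064 x 5120 parameters each.
ASSUMED_BLOCK_PARAMS = 13.1e9
ASSUMED_EMBED_PARAMS = 2 * 152064 * 5120

total_bytes = nvfp4_weight_bytes(ASSUMED_BLOCK_PARAMS, ASSUMED_EMBED_PARAMS)
print(f"Estimated weights: {total_bytes / 2**30:.1f} GiB")
```

Under these assumptions the estimate lands on the order of 10 GiB, which is in line with the figure quoted in the removed bullet. Activation memory and KV cache are not included.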