# DeepSeek-Coder-V2-Lite-Instruct-NVFP4

An NV_FP4 (4-bit floating point) quantized version of [deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct).
## Model Details
| Attribute | Value |
|---|---|
| Base Model | deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct |
| Quantization | NV_FP4 (E2M1 + block scaling) |
| Block Size | 32 |
| Original Size | 31.41 GB |
| Quantized Size | 9.44 GB |
| Compression | 3.33x |
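The compression ratio follows from simple storage arithmetic. As an illustrative estimate (weights only, not the exact file layout): each quantized weight costs 4 bits plus an amortized share of one fp16 scale per 32-element block.

```python
# Illustrative storage arithmetic for NV_FP4 with block size 32.
bits_per_weight = 4 + 16 / 32   # 4-bit value + fp16 scale shared by 32 elements
ideal_ratio = 16 / bits_per_weight
print(f"{ideal_ratio:.2f}x")    # ~3.56x for the quantized tensors alone
```

The reported 3.33x is slightly lower than this ideal because embeddings, norms, and biases are preserved in FP16 (see Quantization Stats below).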
## NV_FP4 Format
NV_FP4 is a 4-bit floating point format optimized for NVIDIA hardware:
- Format: 1 sign bit, 2 exponent bits, 1 mantissa bit (E2M1)
- Representable values: ±{0, 0.5, 1, 1.5, 2, 3, 4, 6}
- Block-wise scaling: 32 elements per scale factor
- Target hardware: NVIDIA GPUs including Jetson/Spark
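The representable-value set above can be verified by decoding all 16 E2M1 bit patterns. This sketch assumes an exponent bias of 1 with a subnormal at exponent 0, which reproduces the listed values:

```python
# Decode every E2M1 code (1 sign bit, 2 exponent bits, 1 mantissa bit).
def e2m1(code):
    sign = -1.0 if (code >> 3) & 1 else 1.0
    exp = (code >> 1) & 3
    man = code & 1
    if exp == 0:                         # subnormal: 0 or 0.5
        return sign * man * 0.5
    return sign * (2 ** (exp - 1)) * (1 + man / 2)

magnitudes = sorted({abs(e2m1(c)) for c in range(16)})
print(magnitudes)  # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```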
## Quantization Stats
- Tensors quantized: 5207
- Tensors preserved (FP16): 84 (embeddings, norms, biases)
## Files

- `model_nvfp4_quantized.safetensors` - Quantized weight tensors (packed uint8 + fp16 scales)
- `model_nvfp4_unquantized.safetensors` - Preserved tensors (embeddings, norms, biases)
- `nvfp4_metadata.json` - Quantization metadata
## Usage
```python
from safetensors import safe_open
import torch

# FP4 E2M1 lookup table (non-negative magnitudes)
FP4_VALUES = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def dequantize_nvfp4(packed, scales, shape, block_size=32):
    """Unpack two 4-bit values per uint8 and apply per-block scales."""
    # High nibble: first element of each pair (sign at bit 7, magnitude index at bits 4-6)
    sign1 = ((packed >> 7) & 1).float()
    idx1 = ((packed >> 4) & 7).long()  # cast to long: uint8 indexing acts as a mask
    # Low nibble: second element of each pair
    sign2 = ((packed >> 3) & 1).float()
    idx2 = (packed & 7).long()
    num_blocks = packed.shape[0]
    unpacked = torch.zeros(num_blocks, block_size)
    unpacked[:, 0::2] = FP4_VALUES[idx1] * (1 - 2 * sign1)
    unpacked[:, 1::2] = FP4_VALUES[idx2] * (1 - 2 * sign2)
    dequantized = (unpacked * scales.unsqueeze(-1)).flatten()
    numel = int(torch.prod(torch.tensor(shape)))
    return dequantized[:numel].view(shape)

# Load and dequantize every quantized tensor
with safe_open("model_nvfp4_quantized.safetensors", framework="pt") as f:
    for key in f.keys():
        if key.endswith(".packed"):
            name = key.replace(".packed", "")
            packed = f.get_tensor(f"{name}.packed")
            scales = f.get_tensor(f"{name}.scales")
            shape = tuple(f.get_tensor(f"{name}.shape").tolist())
            weight = dequantize_nvfp4(packed, scales, shape)
            print(f"{name}: {weight.shape}")
```
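For a round-trip check, a minimal single-block quantizer can be sketched. This is an assumption about the packing scheme (it matches the bit layout that `dequantize_nvfp4` above reads), not the exact tool used to produce these files:

```python
import torch

FP4_VALUES = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_block(block):
    """Quantize one block: per-block scale maps absmax onto 6.0 (largest E2M1 value)."""
    scale = block.abs().max() / 6.0
    if scale == 0:
        scale = torch.tensor(1.0)
    scaled = block / scale
    signs = (scaled < 0).to(torch.uint8)
    # Nearest representable magnitude for each element.
    idx = (scaled.abs().unsqueeze(-1) - FP4_VALUES).abs().argmin(dim=-1).to(torch.uint8)
    nibbles = (signs << 3) | idx                      # 4 bits: sign + 3-bit index
    packed = (nibbles[0::2] << 4) | nibbles[1::2]     # two values per uint8
    return packed, scale

def dequantize_block(packed, scale, block_size=32):
    sign1 = ((packed >> 7) & 1).float()
    idx1 = ((packed >> 4) & 7).long()
    sign2 = ((packed >> 3) & 1).float()
    idx2 = (packed & 7).long()
    out = torch.zeros(block_size)
    out[0::2] = FP4_VALUES[idx1] * (1 - 2 * sign1)
    out[1::2] = FP4_VALUES[idx2] * (1 - 2 * sign2)
    return out * scale

torch.manual_seed(0)
block = torch.randn(32)
packed, scale = quantize_block(block)
restored = dequantize_block(packed, scale)
# Rounding error is bounded by half the widest E2M1 gap (4 -> 6) times the scale.
assert (block - restored).abs().max() <= scale + 1e-6
```

Note the accuracy trade-off this makes visible: the worst-case per-element error is proportional to the block's scale, which is why small block sizes (here 32) keep the error tight.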
## Base Model

[DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct) is a Mixture-of-Experts (MoE) code model:
- 16B total / 2.4B active parameters
- 64 routed experts, 6 activated per token, plus 2 shared experts
- 128K context length
## License

This model inherits the license of the base model; see the original model card for details.