nitrogen-quant: pre-calibrated FP8 / NVFP4 NitroGen DiT exports
Pre-built artifacts for the inference-pipeline-benchmark harness's FP8 / NVFP4 NitroGen rounds. Customers do not need to recalibrate: bench sweep downloads these automatically on first FP8/NVFP4 use.
What's in here
| Folder | Precision | Denoise steps | Calibration |
|---|---|---|---|
| fp8-steps16/ | FP8 (e4m3) | 16 | mixed-genre game frames (6 samples) |
| fp8-steps4/ | FP8 (e4m3) | 4 | mixed-genre game frames (6 samples) |
| nvfp4-steps16/ | NVFP4 (Blackwell-only) | 16 | mixed-genre game frames (6 samples) |
Each folder ships:
- ng_dit.onnx: ~466 MB. The DiT submodule of NitroGen 500M after modelopt PTQ calibration + QDQ-instrumented ONNX export. Quantized layers: DiT + SigLIP-2 vision tower Linears. Excluded layers (kept at compute dtype): action_head, action_decoder, proprio_emb, time_emb, timestep*, norm*, ada_ln*, pos_embed*.
- meta.json: calibration metadata (precision, steps, opset, input shapes).
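The exclusion patterns listed above read like shell-style globs. The following is a minimal sketch of how such patterns could be checked against module names. It assumes each pattern is compared against the individual components of a dotted module path; the builder script may match them differently. The `is_excluded` helper and the module names in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the "Excluded layers" list in this README.
EXCLUDED_PATTERNS = [
    "action_head", "action_decoder", "proprio_emb", "time_emb",
    "timestep*", "norm*", "ada_ln*", "pos_embed*",
]


def is_excluded(module_name: str) -> bool:
    """Return True if any dotted component of the name matches a pattern.

    Assumption: patterns apply to each component of a dotted module path
    (e.g. "blocks.0.norm1"), not to the full path as a single string.
    """
    return any(
        fnmatchcase(part, pattern)
        for part in module_name.split(".")
        for pattern in EXCLUDED_PATTERNS
    )
```

Under this assumption, a hypothetical `blocks.0.norm1` would stay at compute dtype, while a hypothetical `blocks.0.attn.qkv` would be a candidate for quantization.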
How to use
In your fork of the benchmark harness:
bench setup --backend nitrogen-quant
bench sweep --gpu rtx_pro6000 --sweep nitrogen-backends
# fp8 / nvfp4 rounds will hf-download from here on first run
To force local recalibration (e.g., when your production frame distribution differs meaningfully from these mixed-genre calibration samples):
NITROGEN_FORCE_RECALIBRATE=1 bench sweep ...
Provenance
- Source model: nvidia/NitroGen (ng.pt, June 2026)
- Calibration tool: nvidia-modelopt 0.44.0
- Export tool: PyTorch 2.12.1, torch.onnx.export(dynamo=False), opset 20
- Builder script: scripts/build_nitrogen_artifacts.py
- Manifest (sha256s): benchmarks/nitrogen_artifacts.yaml
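Since the manifest records sha256 digests, a downloaded file can be checked against its entry by hand. The sketch below only shows the hashing and comparison step; it does not parse benchmarks/nitrogen_artifacts.yaml, whose layout is not described here. The `sha256_of` and `matches_manifest` helpers are hypothetical names, not part of the harness.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so a ~466 MB ONNX file is never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path: Path, expected_hex: str) -> bool:
    """Compare against a digest copied from the manifest (case and whitespace tolerant)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_manifest(Path("fp8-steps16/ng_dit.onnx"), digest)` would return `True` only if the local file is byte-identical to the one the manifest was generated from, where `digest` is the value you copied from the manifest.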
License & attribution
These artifacts derive directly from nvidia/NitroGen weights, so the same NVIDIA Open Model License applies. We've only added QDQ instrumentation; the underlying weights are unchanged.
The NitroGen model itself is the work of NVIDIA / MineDojo. This repo only republishes a quantized export for benchmark convenience.