nitrogen-quant — pre-calibrated FP8 / NVFP4 NitroGen DiT exports

Pre-built artifacts for the FP8 / NVFP4 NitroGen rounds of the inference-pipeline-benchmark harness. Customers do not need to recalibrate: the bench sweep command downloads these automatically on first FP8/NVFP4 use.

What's in here

Folder	Precision	Denoise steps	Calibration
`fp8-steps16/`	FP8 (e4m3)	16	mixed-genre game frames (6 samples)
`fp8-steps4/`	FP8 (e4m3)	4	mixed-genre game frames (6 samples)
`nvfp4-steps16/`	NVFP4 (Blackwell-only)	16	mixed-genre game frames (6 samples)
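
The folder names follow a precision-stepsN pattern, and only the three combinations listed above are published (there is no NVFP4 export at 4 steps). As a rough illustration, the Python sketch below resolves a folder name from a requested precision and step count and fails loudly for an unpublished combination. `AVAILABLE` and `artifact_folder` are hypothetical names for this example and are not part of the harness.

```python
# Hypothetical helper; the entries mirror the table above.
AVAILABLE = {
    ("fp8", 16): "fp8-steps16",
    ("fp8", 4): "fp8-steps4",
    ("nvfp4", 16): "nvfp4-steps16",
}


def artifact_folder(precision: str, steps: int) -> str:
    """Return the folder for a precision/steps pair, or raise if not published."""
    key = (precision.lower(), steps)
    if key not in AVAILABLE:
        published = ", ".join(sorted(AVAILABLE.values()))
        raise ValueError(
            f"no artifact for {precision!r} at {steps} steps; published: {published}"
        )
    return AVAILABLE[key]
```

For example, `artifact_folder("FP8", 4)` resolves to `fp8-steps4`, while `artifact_folder("nvfp4", 4)` raises `ValueError`.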

Each folder ships:

ng_dit.onnx — ~466 MB. The DiT submodule of NitroGen 500M after modelopt PTQ calibration + QDQ-instrumented ONNX export. Quantized layers: DiT + SigLIP-2 vision tower Linears. Excluded layers (kept at compute dtype): action_head, action_decoder, proprio_emb, time_emb, timestep*, norm*, ada_ln*, pos_embed*.
meta.json — calibration metadata (precision, steps, opset, input shapes).
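
Before benchmarking, it can be worth confirming that a downloaded folder's meta.json matches the round you intend to run. The following is a minimal sketch, not the harness's own check. The key names "precision" and "steps" are assumptions based on the field list above, and `check_meta` is a hypothetical helper, so adjust both to the actual file contents.

```python
import json
from pathlib import Path


def check_meta(folder: str, precision: str, steps: int) -> dict:
    """Load <folder>/meta.json and confirm it matches the expected settings.

    Assumes the file has "precision" and "steps" keys; these names are a
    guess and may differ in the shipped meta.json.
    """
    meta = json.loads((Path(folder) / "meta.json").read_text(encoding="utf-8"))
    found = (str(meta.get("precision", "")).lower(), meta.get("steps"))
    expected = (precision.lower(), steps)
    if found != expected:
        raise ValueError(f"meta.json reports {found}, expected {expected}")
    return meta
```

The function returns the parsed metadata, so the remaining fields (opset, input shapes) can be inspected by the caller.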

How to use

In your fork of the benchmark harness:

bench setup --backend nitrogen-quant
bench sweep --gpu rtx_pro6000 --sweep nitrogen-backends
# fp8 / nvfp4 rounds will hf-download from here on first run

To force local recalibration (e.g., when your production frame distribution differs meaningfully from these mixed-genre calibration samples):

NITROGEN_FORCE_RECALIBRATE=1 bench sweep ...

Provenance

Source model: nvidia/NitroGen (ng.pt, June 2026)
Calibration tool: nvidia-modelopt 0.44.0
Export tool: PyTorch 2.12.1 torch.onnx.export(dynamo=False), opset 20
Builder script: scripts/build_nitrogen_artifacts.py
Manifest (sha256s): benchmarks/nitrogen_artifacts.yaml
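
To check a downloaded file against the manifest by hand, a SHA-256 digest can be computed with the Python standard library and compared to the value listed there. This sketch assumes you copy the expected hex digest out of benchmarks/nitrogen_artifacts.yaml yourself, since the manifest's layout is not described here. `sha256_of` and `matches_manifest` are hypothetical helper names.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a hex string copied from the manifest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Chunked reading matters here because ng_dit.onnx is roughly 466 MB and does not need to be loaded into memory at once.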

License & attribution

These artifacts derive directly from the nvidia/NitroGen weights, so the same NVIDIA Open Model License applies. We have only added QDQ instrumentation; the underlying weights are unchanged.

The NitroGen model itself is the work of NVIDIA / MineDojo. This repo only republishes a quantized export for benchmark convenience.
