nitrogen-quant: pre-calibrated FP8 / NVFP4 NitroGen DiT exports

Pre-built artifacts for the inference-pipeline-benchmark harness's FP8 / NVFP4 NitroGen rounds. Customers do not need to recalibrate: bench sweep downloads these automatically on first FP8/NVFP4 use.

What's in here

Folder           Precision                Denoise steps   Calibration
fp8-steps16/     FP8 (e4m3)               16              mixed-genre game frames (6 samples)
fp8-steps4/      FP8 (e4m3)               4               mixed-genre game frames (6 samples)
nvfp4-steps16/   NVFP4 (Blackwell-only)   16              mixed-genre game frames (6 samples)
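The folder layout above maps a (precision, denoise steps) pair to exactly one directory. A minimal Python sketch of that lookup follows; the helper name `artifact_folder` and the lowercase precision keys are assumptions made for illustration, not part of the harness.

```python
# (precision, denoise steps) -> folder name, copied from the table above.
ARTIFACT_FOLDERS = {
    ("fp8", 16): "fp8-steps16",
    ("fp8", 4): "fp8-steps4",
    ("nvfp4", 16): "nvfp4-steps16",
}


def artifact_folder(precision: str, steps: int) -> str:
    """Return the folder for a precision/steps pair (hypothetical helper).

    Raises KeyError for combinations that are not published here,
    for example NVFP4 with 4 denoise steps.
    """
    key = (precision.lower(), steps)
    if key not in ARTIFACT_FOLDERS:
        raise KeyError(f"no pre-built artifact for {precision} at {steps} steps")
    return ARTIFACT_FOLDERS[key]
```

For instance, `artifact_folder("FP8", 4)` resolves to `fp8-steps4`, while an unpublished combination fails loudly instead of silently falling back to another precision.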

Each folder ships:

  • ng_dit.onnx: ~466 MB. The DiT submodule of NitroGen 500M after modelopt PTQ calibration + QDQ-instrumented ONNX export. Quantized layers: DiT + SigLIP-2 vision tower Linears. Excluded layers (kept at compute dtype): action_head, action_decoder, proprio_emb, time_emb, timestep*, norm*, ada_ln*, pos_embed*.
  • meta.json: calibration metadata (precision, steps, opset, input shapes).
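The excluded-layer list above reads as a set of glob patterns. The sketch below shows one way such patterns could be applied to module names; it assumes each pattern is matched against the individual dotted components of a name, which may differ from how modelopt resolves its own exclusion rules. The function name and the sample module names in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied verbatim from the excluded-layer list above.
EXCLUDED_PATTERNS = (
    "action_head",
    "action_decoder",
    "proprio_emb",
    "time_emb",
    "timestep*",
    "norm*",
    "ada_ln*",
    "pos_embed*",
)


def is_excluded(module_name: str) -> bool:
    """True if any dotted component of module_name matches a pattern.

    Assumption: matching is per component (e.g. "norm1" inside
    "blocks.0.norm1"), not against the full dotted path.
    """
    parts = module_name.split(".")
    return any(
        fnmatchcase(part, pattern)
        for part in parts
        for pattern in EXCLUDED_PATTERNS
    )
```

Under this assumption a name such as `blocks.0.norm1` stays at compute dtype, while `blocks.0.attn.qkv` remains eligible for quantization.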

How to use

In your fork of the benchmark harness:

bench setup --backend nitrogen-quant
bench sweep --gpu rtx_pro6000 --sweep nitrogen-backends
# fp8 / nvfp4 rounds will hf-download from here on first run

To force local recalibration (e.g., when your production frame distribution differs meaningfully from these mixed-genre calibration samples):

NITROGEN_FORCE_RECALIBRATE=1 bench sweep ...
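As a rough illustration of the override above, the following Python sketch shows how a script might read the flag. Only the variable name NITROGEN_FORCE_RECALIBRATE comes from this README; the function name and the rule that exactly "1" enables recalibration are assumptions, and the harness may parse the value differently.

```python
import os
from typing import Mapping, Optional


def force_recalibrate(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when local recalibration is requested.

    Assumption: only the exact value "1" enables the override;
    an unset or empty variable leaves the pre-built artifacts in use.
    """
    if env is None:
        env = os.environ
    return env.get("NITROGEN_FORCE_RECALIBRATE", "") == "1"
```

Passing the environment as a parameter keeps the check easy to exercise without mutating the process environment.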

Provenance

License & attribution

These artifacts derive directly from the nvidia/NitroGen weights, so the same NVIDIA Open Model License applies. We've only added QDQ instrumentation; the underlying weights are unchanged.

The NitroGen model itself is the work of NVIDIA / MineDojo. This repo only republishes a quantized export for benchmark convenience.
