Cosmos-Reason2-2B FP8 (HF checkpoint via ModelOpt)

  • Source: nvidia/Cosmos-Reason2-2B (Qwen3-VL)
  • Quantization: FP8 on language_model (vision tower bf16)
  • Toolchain: NVIDIA ModelOpt 0.43.0 export_hf_checkpoint
  • Calibration: 8 multimodal (image+text) samples
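
As a rough illustration of what FP8 calibration computes, the sketch below derives a per-tensor scale from the largest absolute value seen across calibration samples, using 448 as the largest finite E4M3 value. It is a simplified stand-in and not ModelOpt's implementation; the function names and sample values are hypothetical, and rounding onto the FP8 grid is omitted.

```python
E4M3_MAX = 448.0  # largest finite value in FP8 E4M3 (the "fn" variant)


def calibrate_amax(batches):
    """Track the largest absolute value seen across calibration batches."""
    amax = 0.0
    for batch in batches:
        for value in batch:
            amax = max(amax, abs(value))
    return amax


def fp8_scale(amax):
    """Per-tensor scale chosen so that amax maps onto the E4M3 maximum."""
    if amax == 0.0:
        return 1.0  # avoid a zero scale for an all-zero tensor
    return amax / E4M3_MAX


def scale_and_clamp(value, scale):
    """Divide by the scale and clamp to the E4M3 range (no FP8 rounding)."""
    scaled = value / scale
    return max(-E4M3_MAX, min(E4M3_MAX, scaled))


# Hypothetical activations from two calibration samples.
samples = [[0.5, -3.5, 1.25], [7.0, -0.75]]
amax = calibrate_amax(samples)
scale = fp8_scale(amax)
```

With only 8 calibration samples, the observed maximum may underestimate the range seen at inference time; values beyond it are clamped, as `scale_and_clamp` shows.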

To build a TensorRT-LLM / TensorRT-Edge-LLM engine on Jetson Thor, pass this checkpoint directory as the input: trtllm-build --model_dir <this>.
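
Before handing the directory to an engine build, it can help to confirm which tensors are stored as F8_E4M3 and which stay in BF16. The sketch below reads only the JSON header of each .safetensors shard (an 8-byte little-endian length followed by the header), so no tensor data is loaded. The helper names are hypothetical and the checkpoint path is whatever local directory the files were downloaded to.

```python
import json
import struct
from collections import Counter
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        return json.loads(f.read(header_len))


def dtype_counts(checkpoint_dir):
    """Count tensors per dtype across every .safetensors shard in a directory."""
    counts = Counter()
    for shard in sorted(Path(checkpoint_dir).glob("*.safetensors")):
        header = read_safetensors_header(shard)
        for name, info in header.items():
            if name == "__metadata__":  # optional free-form metadata entry
                continue
            counts[info["dtype"]] += 1
    return counts
```

Calling `dtype_counts()` on the local checkpoint directory should report both `BF16` and `F8_E4M3` entries if the vision tower was left unquantized as described.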

Provenance

  • AWS EC2 NVIDIA L40S, Ubuntu 24.04
  • torch==2.6.0+cu124, transformers==4.57.6, nvidia-modelopt==0.43.0, tensorrt==10.16.1
  • Produced by the DevDuck auto-pipeline on 2026-05-07
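
To reproduce the environment, the pinned versions listed above can be compared against what is installed. This is a minimal sketch using the standard library; it assumes the pip distribution names match the names in the list, and the version string a given wheel reports (for example for tensorrt) may carry extra segments, so a mismatch is a prompt to look closer rather than proof of a problem.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the provenance list.
PINNED = {
    "torch": "2.6.0+cu124",
    "transformers": "4.57.6",
    "nvidia-modelopt": "0.43.0",
    "tensorrt": "10.16.1",
}


def _installed(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=_installed):
    """Return {package: (expected, installed)} for every mismatch."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in check_pins(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the check can be exercised without the packages installed.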

Safetensors: 2B params, tensor types BF16 and F8_E4M3