Upload app.py with huggingface_hub fb0c49e verified multimodalart HF Staff committed about 16 hours ago
Calibrate GPU duration to ~18s/step at 1024² (measured 13-15s) 1b2d4d4 verified multimodalart HF Staff committed about 16 hours ago
Bump GPU duration budget for NVFP4 dequant overhead b98d01b verified multimodalart HF Staff committed about 16 hours ago
Access _ATEN_OP_TABLE on NVFP4Tensor directly ac4a7e1 verified multimodalart HF Staff committed about 16 hours ago
Wrap NVFP4 matmul handlers to upcast non-NVFP4 inputs to weight.orig_dtype fbc83a5 verified multimodalart HF Staff committed about 16 hours ago
Also register inner-tensor cuda_aliases so packer captures NVFP4 inner storages c406a24 verified multimodalart HF Staff committed about 17 hours ago
Patch ZeroGPU empty_fake to recurse into wrapper subclasses (NVFP4Tensor pack) 86bd9cc verified multimodalart HF Staff committed about 17 hours ago
Register aten.empty_like on NVFP4Tensor so ZeroGPU can pack quantized weights 0d3980c verified multimodalart HF Staff committed about 20 hours ago
Use NVFP4WeightOnlyConfig (correct torchao class name) 921f3f9 verified multimodalart HF Staff committed about 21 hours ago
Switch NVFP4 backend to torchao (diffusers↔modelopt cfg dict↔list drift) c77828b verified multimodalart HF Staff committed about 21 hours ago
Bypass diffusers↔modelopt config builder; pass NVFP4_DEFAULT_CFG directly db9d038 verified multimodalart HF Staff committed about 21 hours ago
Initial NVFP4 demo for Cosmos3-Super-Text2Image 046dae3 verified multimodalart HF Staff committed about 21 hours ago
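
The "Calibrate GPU duration" commit (1b2d4d4) budgets roughly 18 s per step at 1024² against a measured 13-15 s per step. The arithmetic behind such a budget might look like the sketch below. It is not the code in app.py: the function name `estimate_gpu_duration`, the `overhead_s` parameter, and the linear scaling with pixel count are all hypothetical and chosen only for illustration.

```python
import math

# Per-step budget at 1024x1024, taken from the commit message (~18 s/step).
SECONDS_PER_STEP_AT_1024 = 18.0
BASE_PIXELS = 1024 * 1024


def estimate_gpu_duration(steps, width=1024, height=1024, overhead_s=0.0):
    """Return a GPU time budget in whole seconds (hypothetical helper).

    Assumption: cost scales linearly with pixel count relative to 1024x1024.
    The commit log does not state how other resolutions are handled.
    """
    scale = (width * height) / BASE_PIXELS
    return math.ceil(steps * SECONDS_PER_STEP_AT_1024 * scale + overhead_s)


def headroom(budget_s, measured_s):
    """Fractional safety margin of a per-step budget over a measured time."""
    return budget_s / measured_s - 1.0
```

Under these assumptions, an 18 s budget leaves about 20% headroom over the slowest measured step (15 s) and about 38% over the fastest (13 s).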