	# TODO: MXFP8/NVFP4 Support

	## Status: Temporarily Disabled

	The MXFP8 and NVFP4 quantization formats are temporarily disabled because comfy-kitchen fails to build on the Hugging Face Spaces infrastructure.

	## Issue

	The comfy-kitchen CUDA build fails because the CUDA 12.9 headers conflict with the system glibc headers:
	- The exception specifications declared for `cospi`/`sinpi` in CUDA's `math_functions.h` do not match the declarations in the system headers

	## Planned Resolution

	Options being considered:
	1. Pre-built wheel: Host a pre-compiled comfy-kitchen wheel
	2. Custom Dockerfile: Build comfy-kitchen in a controlled environment
	3. PyTorch fallback: Implement pure PyTorch quantization as fallback
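
Whichever option is chosen, the converter needs a way to decide at runtime whether the compiled kernels are present. The following is a minimal sketch of that check, not the project's actual code: the import name `comfy_kitchen`, the helper `pick_backend`, and the backend labels `"kernel"` and `"pytorch"` are all assumptions made for illustration.

```python
import importlib.util


def pick_backend(module_name: str = "comfy_kitchen") -> str:
    """Return "kernel" if the optional module is importable, else "pytorch".

    `module_name` defaults to an assumed import name for comfy-kitchen.
    """
    # find_spec returns None for a missing top-level module instead of raising,
    # so the check does not import (or compile) anything.
    if importlib.util.find_spec(module_name) is not None:
        return "kernel"
    return "pytorch"


if __name__ == "__main__":
    print(pick_backend())
```

With a check like this, options 1 and 2 make the `"kernel"` path available again, while option 3 supplies the code behind the `"pytorch"` path.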

	## Currently Available Formats

	- FP8 Tensorwise (per-tensor scaling)
	- FP8 Block (per-block scaling, 64 or 128 block size)
	- INT8 Block (Triton-based, 128 block size)
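
The difference between per-tensor and per-block scaling can be sketched in plain Python. This is an illustration under stated assumptions, not the converter's implementation: it assumes the FP8 target is E4M3 (largest finite value 448), treats the tensor as a flat list, and uses contiguous 1-D blocks, whereas a real converter may use 2-D tiles. The function names are hypothetical.

```python
FP8_E4M3_MAX = 448.0  # assumed target: largest finite float8 E4M3 value


def tensorwise_scale(values):
    """One scale for the whole tensor: max |x| divided by the FP8 maximum."""
    amax = max((abs(v) for v in values), default=0.0)
    # An all-zero (or empty) input gets a neutral scale to avoid dividing by 0.
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def block_scales(values, block_size=128):
    """One scale per contiguous block of `block_size` values."""
    return [
        tensorwise_scale(values[i:i + block_size])
        for i in range(0, len(values), block_size)
    ]


if __name__ == "__main__":
    weights = [0.5] * 64 + [448.0] * 64
    print(tensorwise_scale(weights))
    print(block_scales(weights, block_size=64))
```

Dividing each value by its scale maps it into the representable FP8 range. With per-block scaling, a single outlier only coarsens the values in its own block instead of the whole tensor.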

	## Reference

	- comfy-kitchen branch: `sc_mm_mxfp8_sync`
	- MXFP8 requires SM >= 10.0 (Blackwell GPU)
	- NVFP4 requires SM 10.0 or 12.0 and later (Blackwell GPU)
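
Once these formats are re-enabled, the hardware requirements above could be enforced with a small gate. The sketch below is hypothetical: the `MIN_SM` table and `meets_sm_requirement` are illustrative names, and the NVFP4 entry assumes the requirement means SM 10.0 or newer. In PyTorch, `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple that could be passed in as `capability`.

```python
# Minimum compute capability per format, taken from the reference list above.
# Reading the NVFP4 requirement as ">= 10.0" is an assumption.
MIN_SM = {
    "mxfp8": (10, 0),
    "nvfp4": (10, 0),
}


def meets_sm_requirement(fmt, capability):
    """Return True if a (major, minor) capability satisfies the format's minimum."""
    # Tuples compare element by element, so (12, 0) >= (10, 0) is True.
    return tuple(capability) >= MIN_SM[fmt]


if __name__ == "__main__":
    print(meets_sm_requirement("mxfp8", (9, 0)))
    print(meets_sm_requirement("nvfp4", (12, 0)))
```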