patch_sageattention.py

Adds support for the TORCH_CUDA_ARCH_LIST environment variable to SageAttention's setup.py
Allows specifying target GPU architectures via that environment variable
Enables building Docker images on machines without NVIDIA GPUs

Usage (automatically called during Docker build):

cd SageAttention
python3 ../docker/patch_sageattention.py

Why it's needed: SageAttention's original setup.py tries to detect GPU hardware at build time by calling torch.cuda.device_count(). This fails in Docker builds because:

Docker builds don't have GPU access by default (--gpus all is a docker run option and has no effect on docker build)
GPU access during build is not guaranteed across all Docker configurations
Build machines may not have the same GPU as the target runtime machine

The patch adds a check for the TORCH_CUDA_ARCH_LIST environment variable before attempting hardware detection, so the target architectures can be specified explicitly.
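
The check can be pictured with a minimal sketch. This is not the code the patch inserts: the function names parse_arch_list and resolve_compute_capabilities are hypothetical, the hardware probe is passed in as a callback so the example runs without PyTorch or a GPU, and only numeric entries such as "8.6" or "9.0+PTX" are handled (named architectures are assumed out of scope here).

```python
import re


def parse_arch_list(value):
    """Split a TORCH_CUDA_ARCH_LIST-style string into compute capabilities.

    Entries may be separated by spaces or semicolons, e.g. "8.0 8.6;9.0+PTX".
    The "+PTX" suffix is dropped; named architectures are not handled
    (an assumption of this sketch).
    """
    capabilities = set()
    for entry in re.split(r"[;\s]+", value.strip()):
        if not entry:
            continue
        entry = entry.split("+", 1)[0]  # "9.0+PTX" -> "9.0"
        if not re.fullmatch(r"\d+\.\d+[a-z]?", entry):
            raise ValueError(f"unsupported architecture entry: {entry!r}")
        capabilities.add(entry)
    return capabilities


def resolve_compute_capabilities(environ, detect_from_hardware):
    """Prefer the environment variable; probe the hardware only as a fallback."""
    value = environ.get("TORCH_CUDA_ARCH_LIST", "")
    if value.strip():
        return parse_arch_list(value)
    # Hypothetical callback: in a real setup.py this would wrap
    # torch.cuda.device_count() and torch.cuda.get_device_capability().
    return set(detect_from_hardware())
```

Called as resolve_compute_capabilities(os.environ, probe), the probe is never invoked when the variable is set, which is what lets the build proceed on a machine with no GPU.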

sageattention_setup.patch (not used)

Legacy patch file, kept for reference. The Python script approach is preferred.

How the Build Process Works

Environment Setup: TORCH_CUDA_ARCH_LIST is set in the Dockerfile via ARG/ENV
Patch Application: patch_sageattention.py modifies SageAttention's setup.py
Extension Build: The modified setup.py reads TORCH_CUDA_ARCH_LIST and compiles for the specified architectures
SpargeAttn Build: SpargeAttn already supports TORCH_CUDA_ARCH_LIST natively, so no patch is needed
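
The patch-application step can be sketched as a text transformation on setup.py. This is an illustration under assumptions, not the contents of patch_sageattention.py: the marker comment, the function name patch_setup_source, and the idea of inserting a snippet immediately before a target line are all hypothetical.

```python
# Hypothetical marker used to detect that the patch was already applied.
ENV_CHECK_MARKER = "# patched: honor TORCH_CUDA_ARCH_LIST"


def patch_setup_source(source, target_line, snippet):
    """Insert `snippet` before the first line containing `target_line`.

    Returns the source unchanged if the marker is already present, so
    running the patch twice does not duplicate the inserted block.
    Raises RuntimeError if the target line cannot be found.
    """
    if ENV_CHECK_MARKER in source:
        return source
    lines = source.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if target_line in line:
            # Reuse the target line's indentation for the inserted block.
            indent = line[: len(line) - len(line.lstrip())]
            block = [indent + ENV_CHECK_MARKER + "\n"]
            block += [indent + s + "\n" for s in snippet.splitlines()]
            return "".join(lines[:index] + block + lines[index:])
    raise RuntimeError(f"target line not found: {target_line!r}")
```

Failing loudly when the target line is missing is a deliberate choice in this sketch: a silent no-op would let the Docker build continue and fail later at the hardware-detection step.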

Maintenance

If SageAttention is updated, you may need to:

Check if the patch still applies correctly
Update the target line in patch_sageattention.py if the setup.py structure changes
Test the build process with the new version

The patch is designed to be non-intrusive and should work across most SageAttention versions that follow the same setup.py structure.