Docker Build Scripts
This directory contains helper scripts used during the Docker image build process.
Files
patch_sageattention.py
Purpose: Patches SageAttention's setup.py so that it can be built without a GPU present.
What it does:
- Adds support for the TORCH_CUDA_ARCH_LIST environment variable to SageAttention
- Allows specifying target GPU architectures via environment variable
- Enables building Docker images on machines without NVIDIA GPUs
Usage (automatically called during Docker build):
cd SageAttention
python3 ../docker/patch_sageattention.py
Why it's needed:
SageAttention's original setup.py tries to detect GPU hardware at build time using torch.cuda.device_count(). This fails in Docker builds because:
- Docker builds don't have GPU access by default (the --gpus all flag applies to docker run, not to docker build)
- GPU access during build is not guaranteed across all Docker configurations
- Build machines may not have the same GPU as the target runtime machine
The patch adds a check for TORCH_CUDA_ARCH_LIST environment variable before attempting hardware detection, allowing explicit specification of target architectures.
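The precedence described above can be sketched as follows. This is a minimal illustration, not the actual patched code: the function name resolve_compute_capabilities and the detect_gpus callable (a stand-in for the torch.cuda-based hardware detection) are hypothetical, and the parsing assumes entries such as 8.0 or 8.6+PTX separated by semicolons or spaces.

```python
import os
import re


def resolve_compute_capabilities(env=None, detect_gpus=None):
    """Return target compute capabilities, e.g. {"8.0", "8.6"}.

    TORCH_CUDA_ARCH_LIST takes precedence; hardware detection is only a fallback.
    `detect_gpus` is a hypothetical callable standing in for GPU detection.
    """
    env = os.environ if env is None else env
    arch_list = env.get("TORCH_CUDA_ARCH_LIST", "").strip()
    if arch_list:
        # Entries are separated by ';' or whitespace; a "+PTX" suffix is dropped here.
        entries = [e for e in re.split(r"[;\s]+", arch_list) if e]
        return {e.replace("+PTX", "") for e in entries}

    # No explicit list: fall back to detection, which fails on GPU-less build machines.
    detected = set(detect_gpus()) if detect_gpus is not None else set()
    if not detected:
        raise RuntimeError("No GPU detected; set TORCH_CUDA_ARCH_LIST explicitly")
    return detected
```

With the variable set, detection is never called, which is what allows the build to run on a machine without a GPU.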
sageattention_setup.patch (not used)
Legacy patch file - kept for reference. The Python script approach is preferred.
How the Build Process Works
- Environment Setup: TORCH_CUDA_ARCH_LIST is set in the Dockerfile via ARG/ENV
- Patch Application: patch_sageattention.py modifies SageAttention's setup.py
- Extension Build: Modified setup.py reads TORCH_CUDA_ARCH_LIST and compiles for specified architectures
- SpargeAttn Build: Already supports TORCH_CUDA_ARCH_LIST natively, no patch needed
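To make the Extension Build step more concrete, the sketch below shows one common way an architecture list is turned into nvcc -gencode flags. It is an assumption about the general mechanism, not a copy of SageAttention's setup.py; the helper name gencode_flags is hypothetical.

```python
def gencode_flags(arch_list):
    """Translate a TORCH_CUDA_ARCH_LIST-style string into nvcc -gencode flags."""
    flags = []
    for entry in arch_list.replace(";", " ").split():
        wants_ptx = entry.endswith("+PTX")
        capability = entry[: -len("+PTX")] if wants_ptx else entry
        num = capability.replace(".", "")  # "8.6" -> "86"
        # Native machine code (SASS) for this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if wants_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernels.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


if __name__ == "__main__":
    for flag in gencode_flags("8.0;8.6+PTX"):
        print(flag)
```

Listing only the architectures of the target runtime machines keeps compile time and image size down, at the cost of not running on GPUs outside that list unless PTX is included.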
Maintenance
If SageAttention is updated, one may need to:
- Check if the patch still applies correctly
- Update the target line in patch_sageattention.py if the setup.py structure changes
- Test the build process with the new version
The patch is designed to be non-intrusive and should work across most SageAttention versions that follow the same setup.py structure.
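One way to make the "does the patch still apply" check explicit is to have the patch step fail loudly when its target line is missing, and to do nothing when the file is already patched. The sketch below illustrates that idea only; TARGET, MARKER, and apply_patch are hypothetical names, the target line shown is a placeholder rather than the real line from SageAttention's setup.py, and the injected line assumes setup.py already imports os.

```python
MARKER = "# patched: TORCH_CUDA_ARCH_LIST support"
TARGET = "compute_capabilities = set()"  # placeholder; not the real target line


def apply_patch(source, target=TARGET):
    """Insert an environment-variable check before `target` in setup.py source text."""
    if MARKER in source:
        return source  # already patched, so the operation is idempotent

    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.strip() == target:
            indent = line[: len(line) - len(line.lstrip())]
            injected = [
                f"{indent}{MARKER}\n",
                f'{indent}_env_arch_list = os.environ.get("TORCH_CUDA_ARCH_LIST")\n',
            ]
            lines[i:i] = injected  # place the check before the target line
            return "".join(lines)

    # Upstream setup.py changed: stop the build instead of silently skipping the patch.
    raise RuntimeError(f"Target line not found: {target!r}")
```

Raising an error here turns an upstream change into a visible build failure, which is easier to diagnose than an image that was compiled for the wrong architectures.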