SteEsp committed
Commit 630cc3c (verified)
1 parent: 6da57aa

Drop +PTX from CUDA arch — shrink fused_knn_attn compile under HF builder RAM
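
As a rough illustration of why dropping +PTX removes a codegen target: PyTorch's extension builder turns each entry of TORCH_CUDA_ARCH_LIST into nvcc -gencode flags, and a +PTX suffix adds a second flag for the same architecture. The sketch below mimics that expansion. `gencode_flags` is a hypothetical helper written for this note, not PyTorch's API, and the memory actually saved on the HF builder is not measured here.

```python
def gencode_flags(arch_list: str) -> list[str]:
    """Expand a TORCH_CUDA_ARCH_LIST-style string into nvcc -gencode flags.

    Illustrative re-implementation (an assumption for this note),
    not PyTorch's own function.
    """
    flags = []
    # Entries are separated by ';' or spaces, e.g. "8.0;8.6+PTX".
    for entry in arch_list.replace(";", " ").split():
        want_ptx = entry.endswith("+PTX")
        version = entry.removesuffix("+PTX")
        num = version.replace(".", "")  # "8.6" -> "86"
        # Native machine code (SASS) for this exact architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if want_ptx:
            # Extra target: embed PTX so newer GPUs can JIT-compile it.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


old = gencode_flags("8.6+PTX")  # value before this commit
new = gencode_flags("8.6")      # value after this commit
print(len(old), old)
print(len(new), new)
```

Under these assumptions the old setting yields two -gencode targets per compile and the new one yields a single target. The trade-off is that the resulting kernels run only on compute 8.6 hardware and lose the driver JIT fallback for newer GPUs.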

Files changed (1)
  1. Dockerfile +6 -6
Dockerfile CHANGED
@@ -17,12 +17,12 @@ FROM nvidia/cuda:12.8.0-devel-ubuntu22.04
 ENV DEBIAN_FRONTEND=noninteractive \
     PYTHONUNBUFFERED=1 \
     PIP_NO_CACHE_DIR=1 \
-    # Build CUDA kernels for the A10G (compute 8.6) only; +PTX keeps them
-    # forward-compatible with newer GPUs via driver JIT. Compiling all
-    # architectures at once OOM-kills the HF builder.
-    TORCH_CUDA_ARCH_LIST="8.6+PTX" \
-    # Cap parallel nvcc jobs: gsplat's kernels are memory-heavy and the HF
-    # Docker builder has limited RAM; an unbounded build gets OOM-killed.
+    # Build CUDA kernels for exactly the A10G (compute 8.6): no extra arches
+    # and no PTX. The HF Docker builder has limited RAM, and every extra
+    # codegen target pushes a single nvcc compile toward an OOM kill.
+    TORCH_CUDA_ARCH_LIST="8.6" \
+    # Cap parallel nvcc jobs so multi-file extensions (gsplat, nerfacc) don't
+    # OOM the builder.
     MAX_JOBS=2
 
 # Python 3.12 (via deadsnakes) — optgs uses PEP 695 generic syntax that