Drop +PTX from CUDA arch — shrink fused_knn_attn compile under HF builder RAM
Dockerfile +6 -6
@@ -17,12 +17,12 @@ FROM nvidia/cuda:12.8.0-devel-ubuntu22.04
 ENV DEBIAN_FRONTEND=noninteractive \
     PYTHONUNBUFFERED=1 \
     PIP_NO_CACHE_DIR=1 \
-    # Build CUDA kernels for the A10G (compute 8.6)
-    #
-    #
-    TORCH_CUDA_ARCH_LIST="8.6
-    # Cap parallel nvcc jobs
-    #
+    # Build CUDA kernels for exactly the A10G (compute 8.6) — no extra arches
+    # and no PTX. The HF Docker builder has limited RAM, and every extra
+    # codegen target pushes a single nvcc compile toward an OOM kill.
+    TORCH_CUDA_ARCH_LIST="8.6" \
+    # Cap parallel nvcc jobs so multi-file extensions (gsplat, nerfacc) don't
+    # OOM the builder.
     MAX_JOBS=2
 
 # Python 3.12 (via deadsnakes) — optgs uses PEP 695 generic syntax that
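
To make the "extra codegen target" in the new comment concrete, here is a minimal Python sketch of how an arch list such as `TORCH_CUDA_ARCH_LIST` is conventionally expanded into `nvcc -gencode` flags. It is a simplified illustration, not PyTorch's actual code: the function name `arch_list_to_gencode` is hypothetical, and the sketch assumes the usual convention that a plain entry yields one SASS target while a `+PTX` suffix adds a second, PTX target for the same compute capability. The removed value is truncated in this view, so the `"8.6+PTX"` input below is taken only from the commit title.

```python
def arch_list_to_gencode(arch_list: str) -> list[str]:
    """Expand an arch list like "8.0;8.6+PTX" into nvcc -gencode flags.

    Hypothetical helper for illustration; real build systems do more
    validation than this.
    """
    flags = []
    # Entries may be separated by ';' or by spaces.
    for entry in arch_list.replace(";", " ").split():
        want_ptx = entry.endswith("+PTX")
        version = entry.removesuffix("+PTX")   # "8.6+PTX" -> "8.6"
        num = version.replace(".", "")         # "8.6" -> "86"
        # Native machine code (SASS) for exactly this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if want_ptx:
            # Additionally embed PTX so newer GPUs can JIT-compile it.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


if __name__ == "__main__":
    for value in ("8.6+PTX", "8.6"):
        print(value, "->", arch_list_to_gencode(value))
```

Under these assumptions, `"8.6"` produces a single `-gencode` flag, while `"8.6+PTX"` produces two for every compiled `.cu` file, which is the per-compile difference the commit is trimming.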