Georg committed on
Commit bbc3fdc · 1 Parent(s): 0d59dc5

Two-stage Docker build: base image + GPU compilation

Files changed (5)
  1. BUILD.md +171 -0
  2. Dockerfile +7 -62
  3. Dockerfile.base +65 -0
  4. build_base.sh +20 -0
  5. deploy.sh +107 -0
BUILD.md ADDED
@@ -0,0 +1,171 @@
# Two-Stage Docker Build

FoundationPose requires CUDA C++ extensions that must be compiled with a GPU present. To enable local development and avoid build failures, we use a two-stage build process:

## Architecture

```
┌───────────────────────────────────────────────┐
│ Stage 1: Base Image (Local, No GPU)           │
│ - Install system dependencies                 │
│ - Install PyTorch + Python packages           │
│ - Clone FoundationPose repository             │
│ - Patch setup.py for C++17                    │
│ Build locally → Push to DockerHub             │
└───────────────────────────────────────────────┘

┌───────────────────────────────────────────────┐
│ Stage 2: Final Image (HuggingFace, GPU)       │
│ FROM gpue/foundationpose-base:latest          │
│ - Compile mycuda C++ extension (needs GPU)    │
│ - Compile mycpp C++ extension                 │
│ - Download model weights                      │
│ Build on HuggingFace Spaces with GPU          │
└───────────────────────────────────────────────┘
```

## Files

- **Dockerfile.base** - Stage 1: Base image without GPU compilation
- **Dockerfile** - Stage 2: Final image that compiles CUDA extensions
- **build_base.sh** - Build base image only
- **deploy.sh** - Full two-stage deployment

## Quick Start

### Option 1: Automated Deployment

```bash
cd /Users/georgpuschel/repos/robot-ml/foundationpose

# Login to DockerHub
docker login

# Run full deployment (builds base, pushes to DockerHub, deploys to HF, follows logs)
./deploy.sh
```

The script will:

1. Build the base image for linux/amd64
2. Push to DockerHub as `gpue/foundationpose-base:latest`
3. Commit and push changes to HuggingFace
4. Automatically follow the HuggingFace build logs (press Ctrl+C to stop)

**Prerequisites**:

- Docker login credentials for DockerHub
- HuggingFace token in `../training/.env.local` (for following logs)
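For reference, deploy.sh extracts the token from that file by matching a `HUGGINGFACE_TOKEN=` line. A minimal sketch of that lookup (the value below is a placeholder, not a real token):

```shell
# Sketch of deploy.sh's token lookup; env_line stands in for a line read
# from ../training/.env.local (placeholder value, not a real token).
env_line='HUGGINGFACE_TOKEN=hf_example_token'
HF_TOKEN="$(printf '%s\n' "$env_line" | grep '^HUGGINGFACE_TOKEN=' | cut -d'=' -f2)"
echo "token found: ${HF_TOKEN:+yes}"
```

Note that `cut -d'=' -f2` keeps only the first `=`-separated field, so a token value that itself contains `=` would be truncated.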
### Option 2: Manual Steps

#### Step 1: Build and Push Base Image

```bash
# Build for linux/amd64 (HuggingFace platform)
docker build --platform linux/amd64 -f Dockerfile.base -t gpue/foundationpose-base:latest .

# Push to DockerHub
docker login
docker push gpue/foundationpose-base:latest
```

#### Step 2: Deploy to HuggingFace

```bash
# Add the git remote (if not already added)
git remote add hf https://huggingface.co/spaces/gpue/foundationpose

# Push updated Dockerfile to HuggingFace
git add Dockerfile Dockerfile.base
git commit -m "Two-stage build: base image + GPU compilation"
git push hf main
```

HuggingFace will automatically:

1. Pull `gpue/foundationpose-base:latest` from DockerHub
2. Compile C++ extensions with GPU present
3. Download model weights
4. Start the application

## Why Two Stages?

**Problem**: CUDA C++ extensions fail to compile without a GPU:

```
TypeError: expected string or bytes-like object
torch.version.cuda returns None when no GPU is present
```

**Solution**:

- Stage 1 (Base): Everything except GPU compilation - can build anywhere
- Stage 2 (Final): Only GPU-dependent compilation - builds on HuggingFace with GPU

**Benefits**:

- ✓ Build base image locally without GPU
- ✓ Faster HuggingFace builds (base layers cached)
- ✓ Easy to iterate on application code (stage 2 is small)
- ✓ Base image can be reused across multiple projects
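The failure above can be checked for explicitly before a build is attempted. A minimal preflight sketch (assuming `python3` is on `PATH`; a missing `torch` is treated the same as a missing CUDA toolkit):

```shell
# Preflight: does torch report a CUDA toolkit? torch.version.cuda is None
# on CPU-only installs, which is what breaks the extension build.
if python3 -c 'import torch, sys; sys.exit(0 if torch.version.cuda else 1)' 2>/dev/null; then
    can_compile=yes
else
    can_compile=no
fi
echo "CUDA toolchain visible to torch: ${can_compile}"
```

This only checks what `torch` reports, not whether a physical GPU is attached, so treat it as a quick sanity check rather than a guarantee.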
## Troubleshooting

### Platform Mismatch

If you're on Apple Silicon (M1/M2/M3), you must specify `--platform linux/amd64`:

```bash
docker build --platform linux/amd64 -f Dockerfile.base -t gpue/foundationpose-base:latest .
```

### DockerHub Authentication

```bash
docker login
# Enter username: gpue
# Enter password: <your-dockerhub-token>
```
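For scripted use, `docker login` can also read the token from stdin via `--password-stdin`, which keeps it out of shell history. A sketch (`DOCKERHUB_TOKEN` is a placeholder variable, not something the scripts here set):

```shell
DOCKERHUB_TOKEN="example-token"  # placeholder; use a real DockerHub access token
if command -v docker >/dev/null 2>&1; then
    # Pipe the token in instead of typing it at the prompt
    printf '%s' "$DOCKERHUB_TOKEN" | docker login -u gpue --password-stdin \
        || echo "login failed (expected with a placeholder token)"
else
    echo "docker not installed; would pipe the token to: docker login -u gpue --password-stdin"
fi
```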
### Verify Base Image

Test the base image locally (without GPU compilation):

```bash
docker run --rm -it gpue/foundationpose-base:latest bash

# Inside container:
python3 -c "import torch; print(torch.__version__)"
ls /app/FoundationPose
grep "c++17" /app/FoundationPose/bundlesdf/mycuda/setup.py
```

### HuggingFace Build Logs

Monitor the build on HuggingFace (`-N` disables buffering so the logs stream):

```bash
curl -N -H "Authorization: Bearer $HF_TOKEN" \
    "https://huggingface.co/api/spaces/gpue/foundationpose/logs/build"
```

## Updating the Base Image

When you need to change dependencies or system packages:

1. Edit `Dockerfile.base`
2. Rebuild and push:
   ```bash
   ./build_base.sh
   docker push gpue/foundationpose-base:latest
   ```
3. Rebuild on HuggingFace (will pull the updated base automatically)
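Because the Space always pulls `latest`, pushing a new base silently changes every later build. One optional convention (a suggestion, not something the scripts implement) is to also push a dated tag so a known-good base can be pinned:

```shell
# Hypothetical dated-tag convention alongside 'latest'; the docker commands
# are only printed here since this is a sketch.
IMAGE_NAME="gpue/foundationpose-base"
DATE_TAG="$(date +%Y%m%d)"
echo "docker tag ${IMAGE_NAME}:latest ${IMAGE_NAME}:${DATE_TAG}"
echo "docker push ${IMAGE_NAME}:${DATE_TAG}"
```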
## Local Testing (Without GPU)

To test the base image without GPU compilation:

```bash
# Build base
docker build -f Dockerfile.base -t foundationpose-base-test .

# Run without compiling extensions
docker run --rm -p 7860:7860 foundationpose-base-test python3 app.py
```

The app will run in placeholder mode (no real inference), but you can test the UI and API endpoints.
Dockerfile CHANGED
@@ -1,76 +1,21 @@
- FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
-
- # Set environment variables
- ENV DEBIAN_FRONTEND=noninteractive
- ENV CUDA_HOME=/usr/local/cuda
- ENV PATH=${CUDA_HOME}/bin:${PATH}
- ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
+ # Start from base image (build locally, push to DockerHub)
+ # To build base: docker build -f Dockerfile.base -t gpue/foundationpose-base:latest .
+ # To push base: docker push gpue/foundationpose-base:latest
+ FROM gpue/foundationpose-base:latest

  # FoundationPose configuration - always use real model
  ENV FOUNDATIONPOSE_MODEL_REPO=gpue/foundationpose-weights
  ENV USE_REAL_MODEL=true

- # CUDA architecture list for building extensions without GPU present
- # Covers most modern GPUs: Turing (75), Ampere (80,86), Ada (89), Hopper (90)
- ENV TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6;8.9;9.0"
-
- # Install system dependencies
- RUN apt-get update && apt-get install -y \
-     git \
-     wget \
-     cmake \
-     build-essential \
-     python3.10 \
-     python3.10-dev \
-     python3-pip \
-     libgl1-mesa-glx \
-     libglib2.0-0 \
-     libsm6 \
-     libxext6 \
-     libxrender-dev \
-     libgomp1 \
-     libeigen3-dev \
-     ninja-build \
-     && rm -rf /var/lib/apt/lists/*
-
- # Set python3.10 as default
- RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
- RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1
-
- # Upgrade pip
- RUN python3 -m pip install --upgrade pip
-
- # Set working directory
- WORKDIR /app
-
- # Install Python dependencies first (PyTorch needed for building C++ extensions)
- COPY requirements.txt .
- RUN pip install --no-cache-dir --upgrade setuptools wheel
- RUN pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/cu118
- RUN pip install --no-cache-dir -r requirements.txt
-
- # Clone FoundationPose repository
- RUN git clone https://github.com/NVlabs/FoundationPose.git /app/FoundationPose
-
- # Build FoundationPose C++ extensions (skip kaolin - optional dependency)
+ # Build FoundationPose C++ extensions (requires GPU present)
  WORKDIR /app/FoundationPose
- # Patch mycuda setup.py to use C++17 instead of C++14 (PyTorch requires C++17)
- RUN cd bundlesdf/mycuda && \
-     sed -i 's/-std=c++14/-std=c++17/g' setup.py && \
-     pip install . --no-build-isolation
+ RUN cd bundlesdf/mycuda && pip install . --no-build-isolation
  RUN cd mycpp && python setup.py build_ext --inplace

- # Copy application files
+ # Download model weights from HuggingFace
  WORKDIR /app
- COPY app.py client.py estimator.py ./
-
- # Create weights directory and download model weights
- RUN mkdir -p weights
  RUN python3 -c "from huggingface_hub import snapshot_download; \
      snapshot_download(repo_id='gpue/foundationpose-weights', local_dir='weights', repo_type='model')"

- # Expose Gradio port
- EXPOSE 7860
-
  # Run the application
  CMD ["python3", "app.py"]
Dockerfile.base ADDED
@@ -0,0 +1,65 @@
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

# Set environment variables
ENV DEBIAN_FRONTEND=noninteractive
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=${CUDA_HOME}/bin:${PATH}
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}

# CUDA architecture list for building extensions without GPU present
# Covers most modern GPUs: Turing (75), Ampere (80,86), Ada (89), Hopper (90)
ENV TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6;8.9;9.0"

# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    wget \
    cmake \
    build-essential \
    python3.10 \
    python3.10-dev \
    python3-pip \
    libgl1-mesa-glx \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    libgomp1 \
    libeigen3-dev \
    ninja-build \
    && rm -rf /var/lib/apt/lists/*

# Set python3.10 as default
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1

# Upgrade pip
RUN python3 -m pip install --upgrade pip

# Set working directory
WORKDIR /app

# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade setuptools wheel
RUN pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/cu118
RUN pip install --no-cache-dir -r requirements.txt

# Clone FoundationPose repository (but don't build extensions yet)
RUN git clone https://github.com/NVlabs/FoundationPose.git /app/FoundationPose

# Patch mycuda setup.py to use C++17 (preparation for GPU build)
WORKDIR /app/FoundationPose
RUN cd bundlesdf/mycuda && sed -i 's/-std=c++14/-std=c++17/g' setup.py

# Reset workdir
WORKDIR /app

# Copy application files
COPY app.py client.py estimator.py ./

# Create weights directory (weights will be downloaded in final image)
RUN mkdir -p weights

# Expose Gradio port
EXPOSE 7860
build_base.sh ADDED
@@ -0,0 +1,20 @@
#!/bin/bash
# Build and push the base image to DockerHub

set -e

IMAGE_NAME="gpue/foundationpose-base"
TAG="latest"
PLATFORM="linux/amd64"  # HuggingFace Spaces use x86_64

echo "Building base image for ${PLATFORM}: ${IMAGE_NAME}:${TAG}"
docker build --platform ${PLATFORM} -f Dockerfile.base -t ${IMAGE_NAME}:${TAG} .

echo ""
echo "✓ Base image built successfully for ${PLATFORM}"
echo ""
echo "To push to DockerHub:"
echo "  docker login"
echo "  docker push ${IMAGE_NAME}:${TAG}"
echo ""
echo "After pushing, the HuggingFace Space will use this base image."
deploy.sh ADDED
@@ -0,0 +1,107 @@
#!/bin/bash
# Two-stage deployment script for FoundationPose

set -e

IMAGE_NAME="gpue/foundationpose-base"
TAG="latest"
PLATFORM="linux/amd64"
HF_SPACE="gpue/foundationpose"
HF_TOKEN_FILE="../training/.env.local"

echo "==================================="
echo "FoundationPose Two-Stage Deployment"
echo "==================================="
echo ""

# Stage 1: Build and push base image (local, no GPU needed)
echo "Stage 1: Building base image locally (no GPU required)"
echo "Platform: ${PLATFORM}"
echo "Image: ${IMAGE_NAME}:${TAG}"
echo ""

# Check Docker login
if [ ! -f ~/.docker/config.json ] || ! grep -q "index.docker.io" ~/.docker/config.json 2>/dev/null; then
    echo "Error: Not logged in to DockerHub"
    echo "Please run: docker login"
    exit 1
fi
echo "✓ DockerHub authentication verified"
echo ""

docker build --platform ${PLATFORM} -f Dockerfile.base -t ${IMAGE_NAME}:${TAG} .

echo ""
echo "✓ Base image built successfully"
echo ""
echo "Pushing to DockerHub..."
docker push ${IMAGE_NAME}:${TAG}

echo ""
echo "✓ Base image pushed to DockerHub: ${IMAGE_NAME}:${TAG}"
echo ""

# Stage 2: Deploy to HuggingFace
echo "Stage 2: Deploying to HuggingFace Space"
echo ""

# Initialize git repo if needed
if [ ! -d .git ]; then
    echo "Initializing git repository..."
    git init
    git remote add origin https://huggingface.co/spaces/${HF_SPACE}
    echo "✓ Git repository initialized"
    echo ""
fi

# Check if there are changes to commit
if [[ -n $(git status -s) ]]; then
    echo "Committing changes..."
    git add Dockerfile Dockerfile.base BUILD.md build_base.sh deploy.sh
    git commit -m "Two-stage Docker build: base image + GPU compilation"
    echo "✓ Changes committed"
else
    echo "No changes to commit"
fi

# Push to HuggingFace
echo ""
echo "Pushing to HuggingFace Space: ${HF_SPACE}"
git push https://huggingface.co/spaces/${HF_SPACE} main --force

echo ""
echo "✓ Pushed to HuggingFace"
echo ""
echo "HuggingFace will now:"
echo "  1. Pull the base image from DockerHub (${IMAGE_NAME}:${TAG})"
echo "  2. Build C++ extensions with GPU present"
echo "  3. Download model weights"
echo "  4. Start the Gradio app"
echo ""

# Follow build logs
echo "Following build logs..."
echo "Press Ctrl+C to stop watching"
echo ""

# Load HF token
if [ -f "${HF_TOKEN_FILE}" ]; then
    HF_TOKEN=$(grep "^HUGGINGFACE_TOKEN=" "${HF_TOKEN_FILE}" | cut -d'=' -f2)

    if [ -n "${HF_TOKEN}" ]; then
        curl -N -H "Authorization: Bearer ${HF_TOKEN}" \
            "https://huggingface.co/api/spaces/${HF_SPACE}/logs/build" 2>/dev/null | \
        while IFS= read -r line; do
            # Parse JSON and extract data field
            echo "$line" | grep -o '"data":"[^"]*"' | sed 's/"data":"//;s/"$//' | sed 's/\\n/\n/g'
        done
    else
        echo "Warning: HF_TOKEN not found in ${HF_TOKEN_FILE}"
        echo "To follow logs manually:"
        echo "  curl -N -H \"Authorization: Bearer \$HF_TOKEN\" \"https://huggingface.co/api/spaces/${HF_SPACE}/logs/build\""
    fi
else
    echo "Warning: ${HF_TOKEN_FILE} not found"
    echo "To follow logs manually:"
    echo "  curl -N -H \"Authorization: Bearer \$HF_TOKEN\" \"https://huggingface.co/api/spaces/${HF_SPACE}/logs/build\""
fi