# ── Tahkik Inference Space ──────────────────────────────────────────────────
# Uses faster-whisper (CTranslate2 INT8) for ~4x faster inference vs PyTorch.
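# To enable GPU (T4/L4/A100), change the base image to a CUDA runtime image
# that ships cuDNN, for example:
# FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04
# (the CUDA/cuDNN versions must match what the installed ctranslate2 expects,
# and CUDA images ship without Python, so python3 and pip must be installed),
# and set device="cuda" and compute_type="float16" in main.py.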
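# A minimal sketch of the matching main.py change, assuming the model is loaded
# through faster_whisper.WhisperModel (MODEL_ID is a hypothetical placeholder
# for whatever model path or repo id main.py already uses):
#   from faster_whisper import WhisperModel
#   model = WhisperModel(MODEL_ID, device="cuda", compute_type="float16")
# On CPU the equivalent call would use device="cpu", compute_type="int8".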
# ---------------------------------------------------------------------------
FROM python:3.10-slim
# HF Spaces requires a non-root user with UID 1000.
RUN useradd -m -u 1000 user
WORKDIR /home/user/app
# Install dependencies as root (before switching user).
COPY --chown=user requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code.
COPY --chown=user . .
# Redirect all model/cache downloads to /tmp, which the non-root user can
# always write to at runtime in Spaces.
ENV HF_HOME=/tmp/huggingface_cache
ENV HF_HUB_DISABLE_PROGRESS_BARS=1
ENV CT2_VERBOSE=0
USER user
EXPOSE 7860
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]