	# Last commit a2eb473 (Its-OMG, 19 days ago): Bumped torch to 2.7+ and
	# torchcodec to 0.3 to restore ASR auto transcribe

	# BoomConnext-demo HF Space — pre-built React SPA in ./dist/
	FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04

	ENV DEBIAN_FRONTEND=noninteractive
	ENV PYTHONUNBUFFERED=1
	ENV PIP_NO_CACHE_DIR=1
	ENV PIP_DISABLE_PIP_VERSION_CHECK=1

	# System dependencies
	RUN apt-get update && apt-get install -y --no-install-recommends \
	        python3.10 python3-pip python3.10-dev \
	        git ffmpeg libsndfile1 \
	        build-essential \
	    && rm -rf /var/lib/apt/lists/* \
	    && ln -sf /usr/bin/python3.10 /usr/bin/python

	# HF Spaces convention: run as user 1000
	RUN useradd -m -u 1000 user
	USER user
	ENV HOME=/home/user
	ENV PATH=/home/user/.local/bin:$PATH

	WORKDIR $HOME/app

	# Pre-install numpy + Cython so chatterbox-tts's pkuseg dep can build
	RUN pip install --user --upgrade pip setuptools wheel \
	    && pip install --user "numpy>=1.26.0" Cython \
	    && pip install --user --no-build-isolation pkuseg==0.0.25

	# chatterbox-tts==0.1.7 hard-pins transformers==5.2.0 and torch==2.6.0, but
	# OmniVoice needs transformers>=5.3.0 (HiggsAudioV2) and the transformers ASR
	# pipeline calls torchcodec.decoders.AudioDecoder which only exists in
	# torchcodec>=0.3, which in turn needs torch>=2.7. Install chatterbox first
	# so it pulls in vocos/encodec/librosa/etc., then force-upgrade transformers,
	# torch, and torchaudio above their pins with --no-deps. chatterbox uses
	# stable PyTorch APIs and runs fine on torch 2.7 in practice; pip will print
	# a "broken requirement" warning that's safe to ignore here.
	RUN pip install --user chatterbox-tts==0.1.7 \
	    && pip install --user --no-deps --upgrade 'transformers>=5.3.0,<6' \
	    && pip install --user --no-deps --upgrade 'torch>=2.7,<2.8' 'torchaudio>=2.7,<2.8'
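
	# Optional sanity check, not part of the original build: a minimal sketch
	# that imports the force-upgraded packages and prints their versions, so a
	# broken --no-deps upgrade fails at build time instead of at Space startup.
	# It assumes all three packages import cleanly without a GPU present.
	RUN python -c "import torch, torchaudio, transformers; print(torch.__version__, torchaudio.__version__, transformers.__version__)"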

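	# Install the rest of the Python deps. chatterbox-tts is already installed,
	# so pip should treat it as satisfied rather than reinstall it. This relies
	# on requirements.txt not re-pinning torch, torchaudio, or transformers; a
	# conflicting constraint there would move them off the versions forced above.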
	COPY --chown=user requirements.txt ./
	RUN pip install --user -r requirements.txt
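
	# Optional, not part of the original build: log the dependency conflicts
	# that the --no-deps upgrades leave behind. `pip check` exits non-zero when
	# it finds any (here it should at least report chatterbox-tts's torch and
	# transformers pins), so `|| true` keeps the build going.
	RUN pip check || true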

	# Backend code + vendored OmniVoice + pre-built React SPA
	COPY --chown=user omnivoice/ ./omnivoice/
	COPY --chown=user main.py ./
	COPY --chown=user dist/ ./dist/

	# HF Spaces uses port 7860
	EXPOSE 7860
	ENV HOST=0.0.0.0
	ENV PORT=7860
	ENV STATIC_DIR=dist

	CMD ["python", "main.py"]
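
	# To try the image outside HF Spaces (a sketch; the tag name is
	# hypothetical, and --gpus all assumes the NVIDIA Container Toolkit is
	# installed on the host):
	#   docker build -t boomconnext-demo .
	#   docker run --rm --gpus all -p 7860:7860 boomconnext-demo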