custom vllm v0.13.0/CUDA13 image
#4
opened by reneho
I have created a custom vLLM image based on vLLM v0.13.0 and CUDA 13.0. It can be pulled with
docker pull renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12
It resolves some of the warnings and fallbacks I observed in the v0.12.0 image.
Btw, Minimax-M2.1 is expected soon...
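To sanity-check the pulled image, something like this should work; the python one-liner is my own check, not part of the image, and it assumes the image keeps python3 on PATH like the upstream vllm-openai images do:

```shell
# Print the vLLM version and the CUDA version PyTorch was built against.
# Expect output along the lines of "0.13.0 13.0" if the build worked.
docker run --rm --entrypoint python3 \
  renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12 \
  -c "import vllm, torch; print(vllm.__version__, torch.version.cuda)"
```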
Build script:
DOCKER_BUILDKIT=1 \
docker buildx build . \
--target vllm-openai \
--tag vllm/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12 \
--file docker/Dockerfile \
--platform linux/amd64 \
--build-arg max_jobs=112 \
--build-arg nvcc_threads=56 \
--build-arg CUDA_VERSION=13.0.0 \
--build-arg PYTHON_VERSION=3.12 \
--build-arg VLLM_USE_PRECOMPILED=false \
--build-arg VLLM_MAIN_CUDA_VERSION=13 \
--build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.0-devel-ubuntu22.04 \
--build-arg RUN_WHEEL_CHECK=false \
;
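For completeness, a typical way to launch the OpenAI-compatible server from an image like this; the model name is a placeholder, swap in whatever you want to serve, and --gpus all assumes the NVIDIA Container Toolkit is installed:

```shell
# Start the OpenAI-compatible API server on port 8000 (model is a placeholder).
docker run --rm --gpus all -p 8000:8000 \
  renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12 \
  --model <your-model-here>
```

Note the build script tags the image as vllm/vllm-openai:..., so retag (docker tag) before pushing it under your own registry namespace.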