custom vllm v0.13.0/CUDA13 image

#4
by reneho - opened

I have created a custom vllm image based on vllm v0.13.0 and CUDA 13.0. It can be fetched from

docker pull renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12

It resolves some of the warnings and fallbacks I observed with the v0.12.0 image.
Btw, Minimax-M2.1 is expected soon...
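To try the image, a minimal invocation following the standard vllm-openai Docker usage might look like this (the model name and port are illustrative placeholders, not from this post):

```shell
# Serve a model via the image's OpenAI-compatible API server.
# Model name and port below are placeholders -- substitute your own.
docker run --rm --gpus all \
    -p 8000:8000 \
    renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12 \
    --model Qwen/Qwen2.5-7B-Instruct \
    --port 8000
```

The exact entrypoint arguments depend on the vllm-openai image version; check `docker run ... --help` if the flags differ.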

Build script:

DOCKER_BUILDKIT=1 \
    docker buildx build . \
        --target vllm-openai \
        --tag vllm/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12 \
        --file docker/Dockerfile \
        --platform linux/amd64 \
        --build-arg max_jobs=112 \
        --build-arg nvcc_threads=56 \
        --build-arg CUDA_VERSION=13.0.0 \
        --build-arg PYTHON_VERSION=3.12 \
        --build-arg VLLM_USE_PRECOMPILED=false \
        --build-arg VLLM_MAIN_CUDA_VERSION=13 \
        --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.0-devel-ubuntu22.04 \
        --build-arg RUN_WHEEL_CHECK=false \
;
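Note that the build tags the image as `vllm/vllm-openai:...`, while the pull command above references `renehonig/vllm-openai:...`. Presumably the image was retagged and pushed to Docker Hub afterwards, roughly like this (a sketch, not confirmed by the post):

```shell
# Retag the locally built image under the Docker Hub namespace, then push it.
docker tag vllm/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12 \
    renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12
docker push renehonig/vllm-openai:custom-vllm-0.13.0-cuda-13.0-py-3.12
```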
