DarkMindForever committed on
Commit
a9cb600
·
verified ·
1 Parent(s): f6cab9d

Create Dockerfile

Browse files
Files changed (1) hide show
  1. Dockerfile +29 -0
Dockerfile ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
# syntax=docker/dockerfile:1
# llama.cpp server image pre-packaged with the Nanonets-OCR2 1.5B GGUF model
# (Q4_K_M quantization), served on port 7860.
# NOTE(review): the upstream :server tag is floating — pin by digest for
# reproducible builds once a known-good image is identified.
FROM ghcr.io/ggml-org/llama.cpp:server

USER root
# curl is needed both to fetch the model at build time and by HEALTHCHECK
# at runtime. Clean the apt lists in the same layer to keep the image small.
RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates \
      curl \
    && rm -rf /var/lib/apt/lists/*

# Download the model. -f makes curl fail on HTTP errors instead of saving an
# error page as the model file; -L follows Hugging Face's redirect to the CDN.
# chown in the same layer so the non-root runtime user can read the file.
RUN mkdir -p /models && \
    curl -fL https://huggingface.co/mradermacher/Nanonets-OCR2-1.5B-exp-GGUF/resolve/main/Nanonets-OCR2-1.5B-exp.Q4_K_M.gguf -o /models/model.gguf && \
    chown -R 1000:1000 /models

# Drop root before runtime.
USER 1000

# Server configuration — llama-server reads these LLAMA_ARG_* variables as
# its command-line defaults.
ENV LLAMA_ARG_MODEL=/models/model.gguf \
    LLAMA_ARG_HOST=0.0.0.0 \
    LLAMA_ARG_PORT=7860 \
    LLAMA_ARG_CTX_SIZE=8192 \
    LLAMA_ARG_THREADS=8 \
    LLAMA_ARG_CONT_BATCHING=true

# Performance tuning: logical batch size vs. physical micro-batch size.
ENV LLAMA_ARG_BATCH_SIZE=2048 \
    LLAMA_ARG_UBATCH_SIZE=512

# Documentation only — publish with `docker run -p 7860:7860`.
EXPOSE 7860

HEALTHCHECK --interval=30s --timeout=15s --start-period=10s --retries=3 \
  CMD curl -fsS http://localhost:7860/health || exit 1

ENTRYPOINT ["/app/llama-server"]