Multi-stage Docker build: Stage 1 compiles llama-cpp-python to a wheel, Stage 2 installs the pre-built wheel - NO TIMEOUT! Pre-download fast-chat model at build time. 3274ec4 Sabithulla committed on Feb 23
Switch to Ollama for zero-compilation deployment - pre-downloads models at startup 64f495c Sabithulla committed on Feb 23