Multi-stage Docker build: Stage 1 compiles llama-cpp-python once, Stage 2 reuses compiled wheels - NO TIMEOUT! Build time 8-12 minutes first time, then cached. 9d2777a Sabithulla committed 29 days ago
Multi-stage Docker build: Stage 1 compiles llama-cpp-python to wheel, Stage 2 installs pre-built wheel - NO TIMEOUT! Pre-download fast-chat model at build time. 3274ec4 Sabithulla committed 29 days ago
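The multi-stage build described in the two commits above could be sketched roughly as follows. This is a minimal illustration, not the repo's actual Dockerfile; the base image tag, wheel directory, and package list are assumptions:

```dockerfile
# Stage 1: compile llama-cpp-python into a wheel once (needs a compiler toolchain).
FROM python:3.11-slim AS builder
RUN apt-get update && apt-get install -y --no-install-recommends build-essential cmake \
    && rm -rf /var/lib/apt/lists/*
# pip wheel builds the package but does not install it; the result lands in /wheels.
RUN pip wheel --wheel-dir /wheels llama-cpp-python

# Stage 2: install the pre-built wheel; no compiler needed, so this layer is fast
# and the expensive builder stage is cached between rebuilds.
FROM python:3.11-slim
COPY --from=builder /wheels /wheels
RUN pip install --no-index --find-links=/wheels llama-cpp-python
```

Because the builder stage only changes when its own instructions change, later edits to the app code rebuild only stage 2, which matches the "then cached" claim in the commit message.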
Switch to Ollama for zero-compilation deployment - pre-downloads models at startup 64f495c Sabithulla committed 29 days ago
Switch to pre-compiled llama-cpp-python binary wheels - no compilation timeout on HF Spaces 939e78c Sabithulla committed 29 days ago
Revert to llama-cpp-python with storage optimization - only load fast-chat at startup 1454974 Sabithulla committed 29 days ago
Switch to ctransformers (pre-built, no compilation!) - faster HF Spaces deploy cf04577 Sabithulla committed 29 days ago
Add build-essential & cmake for llama-cpp-python compilation e57d268 Sabithulla committed 29 days ago
Optimize: use --prefer-binary for wheels on Linux but not committed 639666f Sabithulla committed 29 days ago
Remove llama-cpp-python from requirements - already in base image 9f226af Sabithulla committed 29 days ago
Use pre-built llama-cpp-python Docker image - skip compilation e975c5c Sabithulla committed 29 days ago
Fix: revert to standard pip install (binary-only broke uvicorn install) 96b77f2 Sabithulla committed 29 days ago
Force binary wheels only - skip all compilation for fast HF Spaces build 3323511 Sabithulla committed 29 days ago
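The "--prefer-binary" and "force binary wheels only" attempts above correspond to standard pip flags, which could be wired into a Dockerfile roughly like this (a sketch; the base image and requirements file are illustrative, not taken from the repo):

```dockerfile
FROM python:3.11-slim
COPY requirements.txt .
# --prefer-binary: favor published wheels over source distributions, but still
# fall back to compiling when no wheel exists for this platform.
RUN pip install --prefer-binary -r requirements.txt
# Stricter variant: --only-binary=:all: refuses to build anything from source,
# so the install fails fast if any dependency ships no compatible wheel
# (apparently what broke the uvicorn install reverted in 96b77f2).
# RUN pip install --only-binary=:all: -r requirements.txt
```

The trade-off is visible in the commit history itself: the binary-only install avoids the HF Spaces compilation timeout, but only works when every package in the dependency tree publishes a wheel for the build platform.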
Optimize Dockerfile: skip expensive llama-cpp-python compilation flags a9145dc Sabithulla committed 29 days ago