fix: load LoRA adapter weights to CPU for ZeroGPU startup compat e8c46ef Kasualdad committed 23 days ago
Switch inference to transformers + bnb-4bit + LoRA for ZeroGPU de794a7 Kasualdad committed 23 days ago
Switch from Zero GPU to T4: remove Dockerfile, simplify theme 53a83b7 Kasualdad committed 24 days ago
fix: remove nonexistent fine-tuned model reference to prevent double-linking on HF Space 14ee1d7 Kasualdad committed 29 days ago
fix: restore GPU offload on Zero GPU (emulation mode handles module-level CUDA) 47f650c Kasualdad committed 29 days ago
fix: CPU-only on Zero GPU (model persists in RAM between queries), 8 threads, restore eager load 7a51b3e Kasualdad committed 29 days ago
fix: preload ALL CUDA .so files, add nvidia-cublas-cu12 + nvidia-cusparse-cu12 a963ee6 Kasualdad committed 29 days ago
fix: preload libcudart.so.12 via ctypes (LD_LIBRARY_PATH has no effect at runtime) adf4793 Kasualdad committed 29 days ago
fix: resolve CUDA runtime from pip package, add GPU→CPU fallback a6f6cbd Kasualdad committed 29 days ago
Optimize for free HF Space: n_threads=2, remove threading timeout overhead 4c667d9 Kasualdad committed 29 days ago