This repository contains libraries that were pre-built locally.

My purpose is to develop private, CPU-only Spaces that give free access to models via Gradio, so that the models can be served but not downloaded.

In CPU-only Spaces, the online build of llama-cpp-python fails due to OOM or timeout.

Hence, I built it locally under Ubuntu 24.04, targeting the environment that the HF Space reported as of 2026-05-15:


3.13.13 (main, May  8 2026, 22:42:09) [GCC 14.2.0]
x86_64
('glibc', '2.41')

The script (run as a temporary app.py, with no requirements.txt) to verify the HF default configuration for a CPU-only private Space is:

    import sys, platform
    print(sys.version)          # e.g. 3.11.9
    print(platform.machine())   # x86_64
    print(platform.libc_ver())  # glibc 2.35 → Ubuntu 22.04
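The same check could be automated by comparing the running environment against the build target listed above. The following is only a minimal sketch: the names `TARGET`, `current_environment`, and `mismatches` are hypothetical helpers introduced for illustration, and the target values are simply copied from the output reported in this README.

```python
import platform
import sys

# Build target as reported by the Space on 2026-05-15 (values taken from this README).
TARGET = {"python": (3, 13), "machine": "x86_64", "libc": ("glibc", "2.41")}


def current_environment():
    """Collect the same three facts that the verification script prints."""
    return {
        "python": tuple(sys.version_info[:2]),
        "machine": platform.machine(),
        "libc": platform.libc_ver(),
    }


def mismatches(env, target=TARGET):
    """Return the keys whose values differ between env and target."""
    return [key for key in target if env.get(key) != target[key]]


if __name__ == "__main__":
    diff = mismatches(current_environment())
    if diff:
        print(f"environment differs from build target in: {diff}")
    else:
        print("environment matches build target")
```

If the Space image changes (for example a newer Python or glibc), the reported differences would suggest that the wheel may need to be rebuilt.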

First library built: llama_cpp_python-0.3.23-py3-none-linux_x86_64.whl
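To see which environments the wheel filename claims to support, its tags can be split apart. This is a minimal sketch assuming the standard wheel filename layout (name, version, optional build tag, Python tag, ABI tag, platform tag); `parse_wheel_filename` is a hypothetical helper, not part of any library.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its tagged components.

    Assumes the layout name-version[-build]-python-abi-platform.whl.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    info = parse_wheel_filename("llama_cpp_python-0.3.23-py3-none-linux_x86_64.whl")
    for key, value in info.items():
        print(f"{key}: {value}")
```

The platform tag `linux_x86_64` does not encode a glibc version, so the filename alone does not guarantee compatibility with a given Space image; the environment check described above remains the practical safeguard.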
