CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python

Clone the git repository


To get the full control you are after (owning the whole build process):

Ideally, clone the official repository: git clone https://github.com/ggerganov/llama.cpp

Replace or modify the .cpp and .h files with your OFFELLIA logic.

Build manually using make or cmake.

cd llama.cpp


cmake -B build

cmake --build build --config Release -j $(nproc)
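
The clone, modify, and build steps above could be wrapped in one small script. This is only a sketch under a few assumptions that are not in the notes: the run helper, the DRY_RUN switch, and the SRC_DIR, BUILD_DIR and JOBS variables are hypothetical names chosen for illustration, and the build goes into a separate build directory. By default it only prints the commands it would run; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: clone llama.cpp, leave room for local edits, then build with CMake.
# DRY_RUN, SRC_DIR, BUILD_DIR, JOBS and run() are illustrative names (assumptions).
set -e

DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print commands, 0 = execute them
REPO_URL="https://github.com/ggerganov/llama.cpp"
SRC_DIR="${SRC_DIR:-llama.cpp}"         # where the checkout goes
BUILD_DIR="${BUILD_DIR:-build}"         # build directory inside the checkout

# Print the command in dry-run mode, otherwise run it unchanged.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# nproc is not available everywhere; fall back to a single job.
JOBS="$(nproc 2>/dev/null || echo 1)"

# 1. Clone the official repository.
run git clone "$REPO_URL" "$SRC_DIR"

# 2. Replace or modify the .cpp and .h files here (manual step, not automated).

# 3. Configure, then build in Release mode.
run cmake -S "$SRC_DIR" -B "$SRC_DIR/$BUILD_DIR" -DCMAKE_BUILD_TYPE=Release
run cmake --build "$SRC_DIR/$BUILD_DIR" --config Release -j "$JOBS"
```

Run it once as-is to check the printed commands, then again with DRY_RUN=0. Keeping the build in its own directory means the modified source tree stays free of generated files, which makes it easier to see exactly which .cpp and .h files were changed.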