fix: increase max_tokens to 2048 for long translations c42eb37 verified hugh007 committed 21 days ago
fix: use pre-compiled llama-cpp-python wheel + model in image 109e74f verified hugh007 committed 21 days ago
fix: use pre-compiled llama-server binary (zero compilation) ef4cebf verified hugh007 committed 21 days ago
fix: use ninja-build + CMAKE_ARGS for llama-cpp-python build 272fa57 verified hugh007 committed 21 days ago