feat(runner.sh): only enable prefix caching and disable request logging c0cde8e yusufs committed on Jan 28, 2025
feat(runner.sh): --enable-chunked-prefill and --enable-prefix-caching for faster generation 8c5a84b yusufs committed on Jan 28, 2025
fix(runner.sh): disable eager-loading so that it uses CUDA graphs (for parallel and faster processing) 6bb48e9 yusufs committed on Jan 20, 2025
feat(runner.sh): use runner.sh to select the LLM at runtime 69c6372 yusufs committed on Dec 26, 2024