Spaces: yusufs/vllm-inference
Paused
main
477 kB
1 contributor
History: 57 commits
Latest commit: yusufs, feat(runner.sh): DeepSeek-R1-Distill-Qwen-32B d66bcfc2f3fd52799f95943264f32ba15ca0003d (148829b), about 1 year ago
File | Size | Last commit | Updated
.gitignore | 19 Bytes | feat(download_model.py): remove download_model.py during build, it causing big image size | over 1 year ago
Dockerfile | 1.44 kB | feat(Dockerfile): install gcc | about 1 year ago
README.md | 1.73 kB | feat(add-model): always download model during build, it will be cached in the consecutive builds | over 1 year ago
download_model.py | 700 Bytes | feat(add-model): always download model during build, it will be cached in the consecutive builds | over 1 year ago
main.py | 6.7 kB | feat(parse): parse output | over 1 year ago
openai_compatible_api_server.py | 24.4 kB | feat(dep_sizes.txt): removes dep_sizes.txt during build, it not needed | over 1 year ago
poetry.lock | 426 kB | feat(refactor): move the files to root | over 1 year ago
pyproject.toml | 416 Bytes | feat(refactor): move the files to root | over 1 year ago
requirements.txt | 9.99 kB | feat(first-commit): follow examples and tutorials | over 1 year ago
run-llama.sh | 1.51 kB | fix(runner.sh): --enforce-eager not support values | about 1 year ago
run-sailor.sh | 1.83 kB | fix(runner.sh): --enforce-eager not support values | about 1 year ago
runner.sh | 2.04 kB | feat(runner.sh): DeepSeek-R1-Distill-Qwen-32B d66bcfc2f3fd52799f95943264f32ba15ca0003d | about 1 year ago