Spaces: Alovestocode / ZeroGPU-LLM-Inference
ZeroGPU-LLM-Inference (68.3 kB, 1 contributor)
History: 25 commits
Alikestocode: Add Google Cloud Platform deployment configurations (aa65d00, 3 months ago)
File                       Size       Last commit                                                       Updated
.dockerignore              104 Bytes  Add Google Cloud Platform deployment configurations               3 months ago
.gitattributes             1.52 kB    Initial commit: ZeroGPU LLM Inference Space                       3 months ago
.gitignore                 27 Bytes   Add .gitignore and remove cache files                             3 months ago
Dockerfile                 680 Bytes  Add Google Cloud Platform deployment configurations               3 months ago
README.md                  4.23 kB    Implement vLLM with LLM Compressor and performance optimizations  3 months ago
app.py                     34.5 kB    Fix Gradio UI structure and add comprehensive fallback logging    3 months ago
apt.txt                    11 Bytes   Initial commit: ZeroGPU LLM Inference Space                       3 months ago
cloudbuild.yaml            1.34 kB    Add Google Cloud Platform deployment configurations               3 months ago
deploy-compute-engine.sh   4.23 kB    Add Google Cloud Platform deployment configurations               3 months ago
deploy-gcp.sh              2.67 kB    Add Google Cloud Platform deployment configurations               3 months ago
gcp-deployment.md          5.32 kB    Add Google Cloud Platform deployment configurations               3 months ago
requirements.txt           197 Bytes  Implement vLLM with LLM Compressor and performance optimizations  3 months ago
style.css                  2.84 kB    Initial commit: ZeroGPU LLM Inference Space                       3 months ago
test_api.py                3.43 kB    Migrate to AWQ quantization with FlashAttention-2                 3 months ago
test_api_gradio_client.py  7.2 kB     Implement vLLM with LLM Compressor and performance optimizations  3 months ago