Spaces: Alovestocode/ZeroGPU-LLM-Inference
e9f4b24
ZeroGPU-LLM-Inference
137 kB
1 contributor
History: 58 commits
Latest commit: Alikestocode, "Try alternative oneshot() API parameter names" (e9f4b24, 5 months ago)
| File | Scan | Size | Last commit | Updated |
| --- | --- | --- | --- | --- |
| .dockerignore | Safe | 104 Bytes | Add Google Cloud Platform deployment configurations | 5 months ago |
| .gitattributes | Safe | 1.52 kB | Initial commit: ZeroGPU LLM Inference Space | 5 months ago |
| .gitignore | Safe | 27 Bytes | Add .gitignore and remove cache files | 5 months ago |
| DEPLOYMENT_STATUS.md | Safe | 2.21 kB | Add deployment status document after re-authentication | 5 months ago |
| Dockerfile | Safe | 1.02 kB | Fix delete_revisions import with fallback cache cleanup | 5 months ago |
| FIX_PERMISSIONS.md | Safe | 2.05 kB | Add permission fix guide for spherical-gate-477614-q7 project | 5 months ago |
| LLM_COMPRESSOR_FEATURES.md | Safe | 6.24 kB | Fix AWQModifier import path: use modifiers.awq instead of modifiers.quantization | 5 months ago |
| MANUAL_DEPLOY.md | Safe | 1.59 kB | Fix delete_revisions import with fallback cache cleanup | 5 months ago |
| QUANTIZE_AWQ.md | Safe | 3.21 kB | Fix AWQModifier import path: use modifiers.awq instead of modifiers.quantization | 5 months ago |
| README.md | Safe | 4.23 kB | Implement vLLM with LLM Compressor and performance optimizations | 5 months ago |
| app.py | Safe | 40.7 kB | Fix AWQModifier import path: use modifiers.awq instead of modifiers.quantization | 5 months ago |
| apt.txt | Safe | 11 Bytes | Initial commit: ZeroGPU LLM Inference Space | 5 months ago |
| cloudbuild.yaml | Safe | 1.36 kB | Add Cloud Build deployment script and permission setup helper | 5 months ago |
| deploy-cloud-build.sh | Safe | 3.31 kB | Add Cloud Build deployment script and permission setup helper | 5 months ago |
| deploy-compute-engine.sh | Safe | 4.23 kB | Add Google Cloud Platform deployment configurations | 5 months ago |
| deploy-gcp.sh | Safe | 2.67 kB | Add Google Cloud Platform deployment configurations | 5 months ago |
| gcp-deployment.md | Safe | 5.32 kB | Add Google Cloud Platform deployment configurations | 5 months ago |
| quantize_to_awq_colab.ipynb | Safe | 30.3 kB | Try alternative oneshot() API parameter names | 5 months ago |
| requirements.txt | Safe | 397 Bytes | Clarify LLM Compressor optional status - vLLM has native AWQ support | 5 months ago |
| setup-gcp-permissions.sh | Safe | 1.8 kB | Add Cloud Build deployment script and permission setup helper | 5 months ago |
| style.css | Safe | 2.84 kB | Initial commit: ZeroGPU LLM Inference Space | 5 months ago |
| test_api.py | Safe | 3.43 kB | Migrate to AWQ quantization with FlashAttention-2 | 5 months ago |
| test_api_gradio_client.py | Safe | 7.2 kB | Implement vLLM with LLM Compressor and performance optimizations | 5 months ago |
| test_quantization_notebook.py | Safe | 9.11 kB | Add local test script for quantization notebook validation | 5 months ago |
| test_space_simple.sh | Safe | 1.68 kB | Fix delete_revisions import with fallback cache cleanup | 5 months ago |