Space: Alovestocode / ZeroGPU-LLM-Inference (status: Sleeping)

ZeroGPU-LLM-Inference · 68.3 kB
1 contributor · History: 25 commits
Latest commit: Alikestocode, "Add Google Cloud Platform deployment configurations" (aa65d00, 3 months ago)
  • .dockerignore · 104 Bytes · Add Google Cloud Platform deployment configurations · 3 months ago
  • .gitattributes · 1.52 kB · Initial commit: ZeroGPU LLM Inference Space · 3 months ago
  • .gitignore · 27 Bytes · Add .gitignore and remove cache files · 3 months ago
  • Dockerfile · 680 Bytes · Add Google Cloud Platform deployment configurations · 3 months ago
  • README.md · 4.23 kB · Implement vLLM with LLM Compressor and performance optimizations · 3 months ago
  • app.py · 34.5 kB · Fix Gradio UI structure and add comprehensive fallback logging · 3 months ago
  • apt.txt · 11 Bytes · Initial commit: ZeroGPU LLM Inference Space · 3 months ago
  • cloudbuild.yaml · 1.34 kB · Add Google Cloud Platform deployment configurations · 3 months ago
  • deploy-compute-engine.sh · 4.23 kB · Add Google Cloud Platform deployment configurations · 3 months ago
  • deploy-gcp.sh · 2.67 kB · Add Google Cloud Platform deployment configurations · 3 months ago
  • gcp-deployment.md · 5.32 kB · Add Google Cloud Platform deployment configurations · 3 months ago
  • requirements.txt · 197 Bytes · Implement vLLM with LLM Compressor and performance optimizations · 3 months ago
  • style.css · 2.84 kB · Initial commit: ZeroGPU LLM Inference Space · 3 months ago
  • test_api.py · 3.43 kB · Migrate to AWQ quantization with FlashAttention-2 · 3 months ago
  • test_api_gradio_client.py · 7.2 kB · Implement vLLM with LLM Compressor and performance optimizations · 3 months ago