Apply for community grant: Personal project (gpu)
#2
by liushuaiqian - opened
Subject: GPU Request for Deploying 7B Instruction-Tuned LLM (Gemma-7B-IT + LoRA)
Hello Hugging Face Team,
I am requesting access to a T4 GPU for my Space “test_bushu” (https://huggingface.co/spaces/liushuaiqian/test_bushu) in order to deploy and serve a fine‑tuned 7B language model based on Gemma-7B-IT, using PEFT + LoRA techniques.
Background:
- The model has been fine-tuned on custom Chinese instruction-response data using LoRA (r=16, α=32, dropout=0.05).
- The total model size (quantized to 4-bit) is around 7–8 GB, but practical inference requires ~12–16 GB of memory (including the full KV cache), which meets or exceeds the 16 GB RAM available on the current CPU-only instance.
- On the CPU-only configuration, the Space either fails to start or runs out of memory almost immediately.
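For context, the memory figures above can be sanity-checked with a back-of-envelope sketch in a few lines of Python. The Gemma-7B architecture numbers used here (≈8.5 B parameters, 28 layers, 16 KV heads, head dim 256, 8k-token context) are assumptions taken from the published model config, not measurements from this Space:

```python
# Rough memory budget for serving Gemma-7B-IT quantized to 4-bit.
# Architecture numbers are assumptions from the published Gemma-7B config.

GIB = 1024 ** 3

def weight_bytes(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the quantized weights alone."""
    return n_params * bits_per_param / 8

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch=1, bytes_per_elem=2):
    """fp16 KV cache: two tensors (K and V) per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

params = 8.5e9                                        # assumed Gemma-7B size
weights_gib = weight_bytes(params, 4) / GIB           # 4-bit weights
kv_gib = kv_cache_bytes(28, 16, 256, 8192) / GIB      # full 8k context

print(f"weights ~{weights_gib:.1f} GiB, KV cache ~{kv_gib:.1f} GiB")
```

On these assumptions the weights come to roughly 4 GiB and a full-context KV cache to another ~3.5 GiB, before framework, activation, and dequantization overhead — consistent with the ~12–16 GB working-set estimate above.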
Purpose:
- Support real-time Chinese instruction-following interaction through a Gradio interface.
- Enable users to test and explore LLM capabilities in Chinese.
- Provide educational and research value, demonstrating lightweight fine-tuning techniques (LoRA + quantization).
I believe this use case aligns with Hugging Face's mission to democratize access to state-of-the-art models and to empower multilingual AI applications. Thank you for your consideration.
Thank you very much!
Best regards,
liushuaiqian