Apply for community grant: Personal project (gpu)

#2
by liushuaiqian - opened

Subject: GPU Request for Deploying 7B Instruction-Tuned LLM (Gemma-7B-IT + LoRA)

Hello Hugging Face Team,

I am requesting access to a T4 GPU for my Space “test_bushu” (https://huggingface.co/spaces/liushuaiqian/test_bushu) in order to deploy and serve a fine‑tuned 7B language model based on Gemma-7B-IT, using PEFT + LoRA techniques.

Background:

  • The model has been fine-tuned on custom Chinese instruction-response data using LoRA (r=16, α=32, dropout=0.05).
  • Total model size (quantized to 4-bit) is around 7–8 GB, but practical inference requires ~12–16 GB of memory (including the full KV cache), which leaves no headroom within the 16 GB RAM of the current CPU-only instance.
  • On the CPU-only configuration, the Space either fails to deploy or runs out of memory immediately.
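For intuition on the memory figures above, here is a back-of-envelope sketch of the two dominant terms, quantized weights and the KV cache. The transformer dimensions below (28 layers, 16 KV heads, head dim 256, ~8.5B parameters, 8K context) are illustrative assumptions for a 7B-class model, not Gemma-7B-IT's published config; runtime overhead (dequantization buffers, activations, CUDA context) comes on top.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1, bytes_per=2):
    """Size of the KV cache: K and V tensors for every layer, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per

def weight_bytes(n_params, bits):
    """Size of the model weights at the given quantization width."""
    return n_params * bits // 8

GIB = 1024 ** 3
# Assumed illustrative values -- NOT the official Gemma-7B-IT config.
weights = weight_bytes(8_500_000_000, 4)  # ~8.5B params at 4-bit
kv = kv_cache_bytes(n_layers=28, n_kv_heads=16, head_dim=256, seq_len=8192)
print(f"weights ~{weights / GIB:.1f} GiB, KV cache ~{kv / GIB:.1f} GiB")
# prints: weights ~4.0 GiB, KV cache ~3.5 GiB
```

Even under these optimistic assumptions, weights plus cache plus runtime overhead quickly consume a 16 GB budget, which is consistent with the OOM behavior described above.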

Purpose:

  • Support real-time Chinese instruction-tuned interaction using Gradio interface.
  • Enable users to test and explore LLM capabilities in Chinese.
  • Provide educational and research value, demonstrating lightweight fine-tuning techniques (LoRA + quantization).
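As a rough illustration of the intended setup, the Space's app could look like the sketch below: load the 4-bit quantized Gemma-7B-IT base, attach the LoRA adapter via PEFT, and expose it through a Gradio chat UI. This is a minimal sketch, not the Space's actual code; the adapter repo name is a placeholder, and generation parameters are arbitrary.

```python
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE = "google/gemma-7b-it"
ADAPTER = "<your-lora-adapter-repo>"  # placeholder, not a real repo id

# Load the base model in 4-bit (NF4) to fit a T4's 16 GB of VRAM.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb, device_map="auto"
)
# Attach the fine-tuned LoRA adapter on top of the quantized base.
model = PeftModel.from_pretrained(model, ADAPTER)

def respond(message, history):
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": message}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512)
    # Strip the prompt tokens and return only the newly generated reply.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

gr.ChatInterface(respond).launch()
```

This keeps the whole stack (4-bit base + LoRA adapter + chat UI) within a single T4, which is the configuration being requested.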

I believe this use case aligns with Hugging Face's mission to democratize access to state-of-the-art models and to empower multilingual AI applications. Thank you for your consideration; I look forward to your response.

Thank you very much!

Best regards,
liushuaiqian
