seyf1elislam posted an update Aug 8, 2025
πŸš€ Run <14B, 12B, 8B… LLMs **for FREE** on Google Colab (15GB VRAM GPU)

πŸ”— Repo: https://github.com/seyf1elislam/LocalLLM_OneClick_Colab

πŸ“Œ How to Use

1. Open the notebook → click “Open in Colab” and enable GPU mode.
2. Enter model details → provide the Hugging Face repo name & quantization type.
   * Example: unsloth/Qwen3-8B-GGUF with quant Q5_K_M
3. Run all cells → wait 1–3 minutes. You'll get a link to the GUI & API (OpenAI-compatible).
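Once the notebook prints its public link, the API follows the standard OpenAI chat-completions format. A minimal sketch of building a request payload (the base URL below is a placeholder, not the real one — use whatever link the notebook prints, and note the model name depends on the GGUF you loaded):

```python
import json

# Placeholder -- replace with the URL printed by the Colab notebook.
BASE_URL = "https://your-colab-tunnel.example/v1"

def build_chat_request(prompt, model="local-model", max_tokens=256):
    """Build an OpenAI-compatible /chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Explain GGUF quantization in one sentence.")
print(json.dumps(payload, indent=2))
```

POST that JSON to `{BASE_URL}/chat/completions` with any HTTP client (e.g. `requests` or the official `openai` SDK pointed at the custom base URL).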

πŸ’‘ Yes, it’s really free. Enjoy! ✨

---

πŸ“ Supported Models (examples)

* Qwen3 14B → Q5_K_M, Q4_K_M
* Qwen3 8B → Q8_0
* Nemo 12B → Q6_K, Q5_K_M
* Gemma3 12B → Q6_K, Q5_K_M
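Why these quant levels pair with these sizes: a GGUF file weighs roughly `parameters × bits-per-weight / 8`, and the result has to fit in Colab's ~15 GB of GPU memory with room left for the KV cache. A back-of-the-envelope check (the bits-per-weight figures are approximate averages for each quant type, not exact):

```python
def approx_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough quantized model size in GB: params (billions) * bits / 8.

    Ignores KV cache and runtime overhead -- a sanity check only.
    """
    return params_b * bits_per_weight / 8

# Approximate average bits-per-weight per quant type (ballpark values).
for name, params, bits in [
    ("Qwen3 14B @ Q5_K_M", 14, 5.5),
    ("Qwen3 8B  @ Q8_0",    8, 8.5),
    ("Nemo 12B  @ Q6_K",   12, 6.6),
]:
    print(f"{name}: ~{approx_size_gb(params, bits):.1f} GB")
```

This is why a 14B model needs Q5/Q4 to fit comfortably, while an 8B model can afford the near-lossless Q8_0.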

---

πŸ’» Available Notebooks

1. KoboldCpp (⭐⭐⭐ Recommended – faster setup & inference)
   🔗 https://github.com/seyf1elislam/LocalLLM_OneClick_Colab/blob/main/awesome_koboldcpp_notebook.ipynb
2. TextGen-WebUI (⭐⭐ Recommended)
   🔗 https://github.com/seyf1elislam/LocalLLM_OneClick_Colab/blob/main/Run_any_gguf_model_in_TextGen_webui.ipynb