seyf1elislam posted an update Aug 8, 2025
πŸš€ Run <14B, 12B, 8B… LLMs **for FREE** on Google Colab (15GB VRAM GPU)

πŸ”— Repo: https://github.com/seyf1elislam/LocalLLM_OneClick_Colab

πŸ“Œ How to Use

1. Open the notebook → click “Open in Colab” and enable GPU mode.
2. Enter model details → provide the Hugging Face repo name & quantization type.
   * Example: unsloth/Qwen3-8B-GGUF with quant Q5_K_M
3. Run all cells → wait 1–3 minutes. You'll get a link to the GUI & API (OpenAI-compatible).
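Once the notebook prints its public link, the API follows the standard OpenAI chat-completions format. A minimal sketch of building a request payload (the base URL below is a placeholder, not the real one — use whatever link the notebook prints, and note the model name depends on the GGUF you loaded):

```python
import json

# Placeholder -- replace with the URL printed by the Colab notebook.
BASE_URL = "https://your-colab-tunnel.example/v1"

def build_chat_request(prompt, model="local-model", max_tokens=256):
    """Build an OpenAI-compatible /chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Explain GGUF quantization in one sentence.")
print(json.dumps(payload, indent=2))
```

POST that JSON to `{BASE_URL}/chat/completions` with any HTTP client (e.g. `requests` or the official `openai` SDK pointed at the custom base URL).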

πŸ’‘ Yes, it’s really free. Enjoy! ✨

---

πŸ“ Supported Models (examples)

* Qwen3 14B → Q5_K_M, Q4_K_M
* Qwen3 8B → Q8_0
* Nemo 12B → Q6_K, Q5_K_M
* Gemma3 12B → Q6_K, Q5_K_M
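Why these quant levels pair with these sizes: a GGUF file weighs roughly `parameters × bits-per-weight / 8`, and the result has to fit in Colab's ~15 GB of GPU memory with room left for the KV cache. A back-of-the-envelope check (the bits-per-weight figures are approximate averages for each quant type, not exact):

```python
def approx_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough quantized model size in GB: params (billions) * bits / 8.

    Ignores KV cache and runtime overhead -- a sanity check only.
    """
    return params_b * bits_per_weight / 8

# Approximate average bits-per-weight per quant type (ballpark values).
for name, params, bits in [
    ("Qwen3 14B @ Q5_K_M", 14, 5.5),
    ("Qwen3 8B  @ Q8_0",    8, 8.5),
    ("Nemo 12B  @ Q6_K",   12, 6.6),
]:
    print(f"{name}: ~{approx_size_gb(params, bits):.1f} GB")
```

This is why a 14B model needs Q5/Q4 to fit comfortably, while an 8B model can afford the near-lossless Q8_0.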

---

πŸ’» Available Notebooks

1. KoboldCpp (⭐⭐⭐ Recommended – faster setup & inference)
   🔗 https://github.com/seyf1elislam/LocalLLM_OneClick_Colab/blob/main/awesome_koboldcpp_notebook.ipynb
2. TextGen-WebUI (⭐⭐ Recommended)
   🔗 https://github.com/seyf1elislam/LocalLLM_OneClick_Colab/blob/main/Run_any_gguf_model_in_TextGen_webui.ipynb