Instructions to use hotdogs/gemma4-26b-opus-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use hotdogs/gemma4-26b-opus-lora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("unsloth/gemma-4-26b-a4b-it")
model = PeftModel.from_pretrained(base_model, "hotdogs/gemma4-26b-opus-lora")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- Unsloth Studio
How to use hotdogs/gemma4-26b-opus-lora with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for hotdogs/gemma4-26b-opus-lora to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for hotdogs/gemma4-26b-opus-lora to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for hotdogs/gemma4-26b-opus-lora to start chatting
Load model with FastModel
pip install unsloth

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="hotdogs/gemma4-26b-opus-lora",
    max_seq_length=2048,
)
Gemma 4 26B A4B (MoE) — Claude 4.6 Opus Reasoning LoRA
🧠 PEFT LoRA adapter (not a full model; it must be used together with a base model)
LoRA trained with Unsloth SFT from Borcherding/Gemma4-26B-A4B-Claude-4.6-Opus-Reasoning-Distilled: rank=8, alpha=8, targets attention+MLP, 944 MB
⚠️ This model is a Mixture-of-Experts (26B total, 4B active), so it uses less VRAM than the dense 31B.
⚠️ The GGUF version contains only the attention+MLP tensors (no expert tensors): 18 MB
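As a rough illustration of the VRAM comparison above, the sketch below estimates weight-only memory from a parameter count and a bytes-per-parameter figure. It assumes every expert stays resident in memory (so the full 26B parameters count, not just the 4B active ones) and ignores the KV cache, activations, and the adapter itself. The function name and the 4-bit figure are illustrative assumptions, not measurements from this repo.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Parameter counts taken from the model names (26B MoE vs. dense 31B).
moe_bf16 = weight_memory_gib(26e9, 2)     # bfloat16 = 2 bytes per parameter
dense_bf16 = weight_memory_gib(31e9, 2)
moe_4bit = weight_memory_gib(26e9, 0.5)   # assumed ~4 bits per parameter

print(f"26B MoE, bf16:   {moe_bf16:.1f} GiB")
print(f"31B dense, bf16: {dense_bf16:.1f} GiB")
print(f"26B MoE, 4-bit:  {moe_4bit:.1f} GiB")
```

Real usage will be higher once context length and runtime overhead are included, so treat these numbers as a lower bound.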
📦 What's in this repo
| File | Description |
|---|---|
| | PEFT LoRA weights (for use with transformers/peft) |
| | LoRA config (rank=8, alpha=8) |
| | GGUF format (attention+MLP only, experts not included) |
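To check which files the repo actually contains, one option is to list them from the Hub and group them by extension. This is a minimal sketch: `group_by_extension` and `list_adapter_files` are hypothetical helper names, and the second one assumes `huggingface_hub` is installed and the network is reachable.

```python
import os
from collections import defaultdict


def group_by_extension(filenames):
    """Group repository paths by file extension (e.g. 'gguf', 'json')."""
    groups = defaultdict(list)
    for name in filenames:
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        groups[ext].append(name)
    return dict(groups)


def list_adapter_files(repo_id="hotdogs/gemma4-26b-opus-lora"):
    """Fetch the file list from the Hub and group it (needs network access)."""
    # Imported lazily so the grouping helper works without huggingface_hub.
    from huggingface_hub import list_repo_files

    return group_by_extension(list_repo_files(repo_id))
```

Calling `list_adapter_files()` returns a dict keyed by extension, which makes it easy to spot the safetensors weights, the JSON config, and the GGUF file.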
🚀 Quick Start
PEFT (transformers)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Load the base model (MoE, 4B active parameters, saves VRAM)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-26B-A4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-26B-A4B-it")

# 2. Load the LoRA adapter
model = PeftModel.from_pretrained(base_model, "hotdogs/gemma4-26b-opus-lora")

# 3. Run inference
messages = [{"role": "user", "content": "Solve this step by step: 3x + 7 = 22"}]
# return_dict=True yields input_ids and attention_mask, so **inputs works below
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
# do_sample=True is needed for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
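If a standalone checkpoint is preferred over loading the adapter at runtime, PEFT's `merge_and_unload()` can fold the LoRA weights into the base model. The sketch below is an assumption-laden example rather than a tested recipe for this adapter: the card does not confirm that merging works for the MoE expert weights, `merge_and_save` is a hypothetical helper name, and the output directory is a placeholder.

```python
def merge_and_save(base_id: str, adapter_id: str, out_dir: str) -> None:
    """Fold the LoRA weights into the base model and save a standalone copy."""
    # Imported lazily so the function can be defined without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()  # base model with the LoRA deltas applied
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)


# Example call (downloads the full base model; the output path is hypothetical):
# merge_and_save(
#     "google/gemma-4-26B-A4B-it",
#     "hotdogs/gemma4-26b-opus-lora",
#     "./gemma4-opus-merged",
# )
```

The merged directory can then be loaded with `AutoModelForCausalLM.from_pretrained` alone, without `peft`.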
llama.cpp (GGUF)
# The GGUF adapter covers attention+MLP only (18 MB)
# --lora applies the adapter at scale 1.0; to use another scale, pass --lora-scaled instead of --lora
./llama-server \
  -m gemma-4-26B-A4B-it-Q4_K_M.gguf \
  --lora gguf/adapter_model.gguf \
  --host 0.0.0.0 --port 8080 \
  --ctx-size 8192 -fa --jinja
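Once `llama-server` is running, it can be queried through its OpenAI-compatible chat endpoint. The sketch below uses only the Python standard library and assumes the server is reachable at `http://localhost:8080`, matching the port in the command above. `build_chat_request` and `ask` are hypothetical helper names, and the sampling values are illustrative.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8080",
                       max_tokens=512, temperature=0.7):
    """Build a POST request for llama-server's /v1/chat/completions endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

For example, `ask("Solve this step by step: 3x + 7 = 22")` sends the same prompt used in the PEFT quick start.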
Ollama Modelfile
FROM gemma4:26b
ADAPTER ./gguf/adapter_model.gguf
PARAMETER temperature 0.7
SYSTEM "You are a thoughtful AI that reasons step by step."
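After building a model from this Modelfile (for example with `ollama create gemma4-opus-lora -f Modelfile`), it can be called through Ollama's local REST API. This is a minimal sketch: the model name `gemma4-opus-lora` is a hypothetical choice, the default Ollama port 11434 is assumed, and the helper names are illustrative.

```python
import json
import urllib.request


def build_ollama_chat(prompt, model="gemma4-opus-lora",
                      host="http://localhost:11434"):
    """Build a non-streaming POST request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,  # hypothetical name given at `ollama create` time
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ollama_chat(prompt, **kwargs):
    """Send the prompt to Ollama and return the reply text."""
    with urllib.request.urlopen(build_ollama_chat(prompt, **kwargs)) as resp:
        return json.load(resp)["message"]["content"]
```

The `SYSTEM` prompt and `temperature` from the Modelfile apply automatically, so the request only needs the user message.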
📊 Adapter Details
| Parameter | Value |
|---|---|
| Base Model | |
| Source | |
| Training | Unsloth SFT |
| Rank | 8 |
| Alpha | 8 |
| Target Modules | Attention + MLP + Experts (full LoRA) |
| PEFT Size | 944 MB |
| GGUF Size | 18 MB (attn+MLP only) |
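With rank 8 and alpha 8, the conventional LoRA scaling factor alpha / rank comes out to 1.0. The pure-Python sketch below shows the textbook update W + (alpha / rank) * (B @ A) on toy matrices. It assumes the standard scaling rule rather than a variant such as rsLoRA, which the card does not mention either way, and all function names are illustrative.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor."""
    return alpha / rank


def matmul(x, y):
    """Multiply two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def apply_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) for list-of-lists matrices."""
    scale = lora_scale(alpha, rank)
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy example: 2x2 weight with a rank-1 adapter (B is 2x1, A is 1x2).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]

print(lora_scale(8, 8))                       # scaling for this adapter's settings
print(apply_lora(W, A, B, alpha=1, rank=1))   # toy merged weight
```

A scale of 1.0 means the low-rank product is added to the base weight without being amplified or damped.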
⚠️ GGUF note: the expert tensors (120 tensors, MoE-specific) were removed because llama.cpp does not support them yet, leaving only attention + MLP (410 tensors).
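The filtering described in the note can be sketched as a simple split of a name-to-tensor mapping. The card does not show the adapter's real tensor names, so the `"experts"` marker and the example keys below are hypothetical and only illustrate the idea. The conversion script actually used for this repo is not documented here.

```python
def split_expert_tensors(state_dict, marker="experts"):
    """Split a name -> tensor mapping into (kept, dropped) by a name marker."""
    kept = {k: v for k, v in state_dict.items() if marker not in k}
    dropped = {k: v for k, v in state_dict.items() if marker in k}
    return kept, dropped


# Hypothetical tensor names, for illustration only.
example = {
    "layers.0.self_attn.q_proj.lora_A.weight": "tensor",
    "layers.0.mlp.down_proj.lora_B.weight": "tensor",
    "layers.0.experts.gate_up_proj.lora_A.weight": "tensor",
}

kept, dropped = split_expert_tensors(example)
print(len(kept), "kept;", len(dropped), "dropped")
```

On the real adapter, the same split would be expected to give the 410 kept and 120 dropped tensors quoted in the note, provided the marker matches the actual naming.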
🙏 Credits
- Training: Borcherding — Claude 4.6 Opus Reasoning Distilled
- GGUF Conversion & Curation: UKA (Hermes Agent, Nous Research)
- Base Model: Google / Unsloth — gemma-4-26b-a4b-it
📜 License
Apache 2.0