How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ramixpe/gemma-xr:Q8_0
# Run inference directly in the terminal:
llama-cli -hf ramixpe/gemma-xr:Q8_0
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ramixpe/gemma-xr:Q8_0
# Run inference directly in the terminal:
llama-cli -hf ramixpe/gemma-xr:Q8_0
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ramixpe/gemma-xr:Q8_0
# Run inference directly in the terminal:
./llama-cli -hf ramixpe/gemma-xr:Q8_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ramixpe/gemma-xr:Q8_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf ramixpe/gemma-xr:Q8_0
Use Docker
docker model run hf.co/ramixpe/gemma-xr:Q8_0
Quick Links

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

Gemma-XR: IOS-XR Expert (Fine-tuned Gemma 4 31B)

Fine-tuned Google Gemma 4 31B-it for Cisco IOS-XR service provider networking.

Score: 47/50 (94%) on 50-prompt evaluation

Bucket Score
Contamination 10/10 (100%)
Hierarchy 10/10 (100%)
Fabrication 8/8 (100%)
Verify 7/7 (100%)
Repair 7/8 (88%)
Clarify 5/7 (71%)

Quick Start (Ollama)

Training Details

  • Base: google/gemma-4-31B-it
  • Method: LoRA r=32, alpha=32, Unsloth
  • LR: 5e-5 (gentle surgical adaptation)
  • Epochs: 2
  • Dataset: 1,133 records (70% broad IOS-XR QA + 30% structured config/repair tasks)
  • Training time: 28 minutes on A100 80GB
  • GGUF: Q8_0 quantization (31GB)
Downloads last month
3
GGUF
Model size
31B params
Architecture
gemma4
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support