Instructions to use Ravert/gemma-4-E4B-it-testv4 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Ravert/gemma-4-E4B-it-testv4 with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Ravert/gemma-4-E4B-it-testv4",
	filename="gemma-4-e4b-it.BF16-mmproj.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use Ravert/gemma-4-E4B-it-testv4 with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Ravert/gemma-4-E4B-it-testv4:BF16
# Run inference directly in the terminal:
llama-cli -hf Ravert/gemma-4-E4B-it-testv4:BF16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Ravert/gemma-4-E4B-it-testv4:BF16
# Run inference directly in the terminal:
llama-cli -hf Ravert/gemma-4-E4B-it-testv4:BF16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Ravert/gemma-4-E4B-it-testv4:BF16
# Run inference directly in the terminal:
./llama-cli -hf Ravert/gemma-4-E4B-it-testv4:BF16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Ravert/gemma-4-E4B-it-testv4:BF16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Ravert/gemma-4-E4B-it-testv4:BF16

Use Docker

docker model run hf.co/Ravert/gemma-4-E4B-it-testv4:BF16

LM Studio
Jan
Ollama
How to use Ravert/gemma-4-E4B-it-testv4 with Ollama:
```
ollama run hf.co/Ravert/gemma-4-E4B-it-testv4:BF16
```

Unsloth Studio

How to use Ravert/gemma-4-E4B-it-testv4 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Ravert/gemma-4-E4B-it-testv4 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Ravert/gemma-4-E4B-it-testv4 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Ravert/gemma-4-E4B-it-testv4 to start chatting

How to use Ravert/gemma-4-E4B-it-testv4 with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Ravert/gemma-4-E4B-it-testv4:BF16

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Ravert/gemma-4-E4B-it-testv4:BF16"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use Ravert/gemma-4-E4B-it-testv4 with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Ravert/gemma-4-E4B-it-testv4:BF16

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Ravert/gemma-4-E4B-it-testv4:BF16

Run Hermes

hermes

Docker Model Runner
How to use Ravert/gemma-4-E4B-it-testv4 with Docker Model Runner:
```
docker model run hf.co/Ravert/gemma-4-E4B-it-testv4:BF16
```

Lemonade

How to use Ravert/gemma-4-E4B-it-testv4 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Ravert/gemma-4-E4B-it-testv4:BF16

Run and chat with the model

lemonade run user.gemma-4-E4B-it-testv4-BF16

List all available models

lemonade list

gemma-4-E4B-it-testv4 / tokenizer_config.json

Ravert

Upload model trained with Unsloth

bd78746 verified 1 day ago

raw

history blame contribute delete

6.86 kB

	{
	"audio_token": "<\|audio\|>",
	"backend": "tokenizers",
	"boa_token": "<\|audio>",
	"boi_token": "<\|image>",
	"bos_token": "<bos>",
	"eoa_token": "<audio\|>",
	"eoc_token": "<channel\|>",
	"eoi_token": "<image\|>",
	"eos_token": "<eos>",
	"eot_token": "<turn\|>",
	"escape_token": "<\|\"\|>",
	"etc_token": "<tool_call\|>",
	"etd_token": "<tool\|>",
	"etr_token": "<tool_response\|>",
	"extra_special_tokens": [
	"<\|video\|>"
	],
	"image_token": "<\|image\|>",
	"is_local": false,
	"mask_token": "<mask>",
	"model_max_length": 131072,
	"model_specific_special_tokens": {
	"audio_token": "<\|audio\|>",
	"boa_token": "<\|audio>",
	"boi_token": "<\|image>",
	"eoa_token": "<audio\|>",
	"eoc_token": "<channel\|>",
	"eoi_token": "<image\|>",
	"eot_token": "<turn\|>",
	"escape_token": "<\|\"\|>",
	"etc_token": "<tool_call\|>",
	"etd_token": "<tool\|>",
	"etr_token": "<tool_response\|>",
	"image_token": "<\|image\|>",
	"soc_token": "<\|channel>",
	"sot_token": "<\|turn>",
	"stc_token": "<\|tool_call>",
	"std_token": "<\|tool>",
	"str_token": "<\|tool_response>",
	"think_token": "<\|think\|>"
	},
	"pad_token": "<pad>",
	"padding_side": "left",
	"processor_class": "Gemma4Processor",
	"response_schema": {
	"properties": {
	"content": {
	"type": "string"
	},
	"role": {
	"const": "assistant"
	},
	"thinking": {
	"type": "string"
	},
	"tool_calls": {
	"items": {
	"properties": {
	"function": {
	"properties": {
	"arguments": {
	"additionalProperties": {},
	"type": "object",
	"x-parser": "gemma4-tool-call"
	},
	"name": {
	"type": "string"
	}
	},
	"type": "object",
	"x-regex": "call\\:(?P<name>\\w+)(?P<arguments>\\{.*\\})"
	},
	"type": {
	"const": "function"
	}
	},
	"type": "object"
	},
	"type": "array",
	"x-regex-iterator": "<\\\|tool_call>(.*?)<tool_call\\\|>"
	}
	},
	"type": "object",
	"x-regex": "(\\<\\\|channel\\>thought\\n(?P<thinking>.?)\\<channel\\\|\\>)?(?P<tool_calls>\\<\\\|tool_call\\>.\\<tool_call\\\|\\>)?(?P<content>(?:(?!\\<turn\\\|\\>)(?!\\<\\\|tool_response\\>).)+)?(?:\\<turn\\\|\\>\|\\<\\\|tool_response\\>)?"
	},
	"soc_token": "<\|channel>",
	"sot_token": "<\|turn>",
	"stc_token": "<\|tool_call>",
	"std_token": "<\|tool>",
	"str_token": "<\|tool_response>",
	"think_token": "<\|think\|>",
	"tokenizer_class": "GemmaTokenizer",
	"unk_token": "<unk>",
	"added_tokens_decoder": {
	"0": {
	"content": "<pad>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"1": {
	"content": "<eos>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"2": {
	"content": "<bos>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"3": {
	"content": "<unk>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"4": {
	"content": "<mask>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"46": {
	"content": "<\|tool>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"47": {
	"content": "<tool\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"48": {
	"content": "<\|tool_call>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"49": {
	"content": "<tool_call\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"50": {
	"content": "<\|tool_response>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"51": {
	"content": "<tool_response\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"52": {
	"content": "<\|\"\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"98": {
	"content": "<\|think\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"100": {
	"content": "<\|channel>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"101": {
	"content": "<channel\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"105": {
	"content": "<\|turn>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"106": {
	"content": "<turn\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"255999": {
	"content": "<\|image>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"256000": {
	"content": "<\|audio>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"258880": {
	"content": "<\|image\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"258881": {
	"content": "<\|audio\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"258882": {
	"content": "<image\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"258883": {
	"content": "<audio\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	},
	"258884": {
	"content": "<\|video\|>",
	"single_word": false,
	"lstrip": false,
	"rstrip": false,
	"normalized": false,
	"special": true
	}
	}
	}