How to use dariashevchuk/gemma-4-e2b-it-h2a with libraries and local apps.

Libraries

How to use dariashevchuk/gemma-4-e2b-it-h2a with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below include an image, so the multimodal pipeline task is used
pipe = pipeline("image-text-to-text", model="dariashevchuk/gemma-4-e2b-it-h2a")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("dariashevchuk/gemma-4-e2b-it-h2a")
model = AutoModelForMultimodalLM.from_pretrained("dariashevchuk/gemma-4-e2b-it-h2a")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use dariashevchuk/gemma-4-e2b-it-h2a with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "dariashevchuk/gemma-4-e2b-it-h2a"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dariashevchuk/gemma-4-e2b-it-h2a",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
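
The same request can be built from Python. The following is a minimal sketch using only the standard library; the helper name `build_chat_request` is hypothetical, and the base URL assumes the default vLLM port shown in the curl call above.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    model: str = "dariashevchuk/gemma-4-e2b-it-h2a",
    base_url: str = "http://localhost:8000",  # assumed: server started as shown above
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("What is the capital of France?")

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     print(body["choices"][0]["message"]["content"])
```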

Use Docker

docker model run hf.co/dariashevchuk/gemma-4-e2b-it-h2a

SGLang

How to use dariashevchuk/gemma-4-e2b-it-h2a with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dariashevchuk/gemma-4-e2b-it-h2a" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dariashevchuk/gemma-4-e2b-it-h2a",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dariashevchuk/gemma-4-e2b-it-h2a" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dariashevchuk/gemma-4-e2b-it-h2a",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use dariashevchuk/gemma-4-e2b-it-h2a with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for dariashevchuk/gemma-4-e2b-it-h2a to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for dariashevchuk/gemma-4-e2b-it-h2a to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for dariashevchuk/gemma-4-e2b-it-h2a to start chatting

Load model with FastModel

# Install Unsloth first (shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="dariashevchuk/gemma-4-e2b-it-h2a",
    max_seq_length=2048,
)

Docker Model Runner
How to use dariashevchuk/gemma-4-e2b-it-h2a with Docker Model Runner:
```
docker model run hf.co/dariashevchuk/gemma-4-e2b-it-h2a
```

Gemma 4 E2B IT - Hate-2-Action

This is a fine-tuned Gemma 4 instruction model for the Hate-2-Action project (GitHub), a Telegram bot that turns complaints and frustration into short, practical suggestions for action.

The model is designed to replace the project's final response-generation LLM call. It does not perform intent routing, database search, embedding generation, or problem extraction. Those operations remain part of the surrounding Hate-2-Action pipeline.

Intended use

The model receives:

A user complaint or description of a social problem
The requested response language
A response style
A topic label

It generates a concise response that:

Acknowledges the underlying issue
Converts frustration into concrete next steps
Follows the requested tone
Responds in English or Ukrainian
Avoids assuming the user's emotional state
Attempts to avoid unsupported facts, organizations, locations, and links

The supported response styles in the training dataset are:

normal
polite
funny
sarcastic
rude
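
The four inputs listed above could be packed into a single chat message along these lines. This is a sketch only: the prompt layout used during fine-tuning is not documented in this card, so the field labels, the helper name `build_messages`, and the example topic label `environment` are all assumptions.

```python
# Hypothetical helper; the real Hate-2-Action prompt template may differ.
SUPPORTED_STYLES = {"normal", "polite", "funny", "sarcastic", "rude"}
SUPPORTED_LANGUAGES = {"English", "Ukrainian"}


def build_messages(complaint: str, language: str, style: str, topic: str) -> list[dict]:
    """Pack the four model inputs into a single-turn chat message list."""
    if style not in SUPPORTED_STYLES:
        raise ValueError(f"Unsupported style: {style!r}")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    # Assumed field layout; adjust it to match the training template.
    prompt = (
        f"Language: {language}\n"
        f"Style: {style}\n"
        f"Topic: {topic}\n"
        f"Complaint: {complaint}"
    )
    return [{"role": "user", "content": [{"type": "text", "text": prompt}]}]


messages = build_messages(
    complaint="The park near my house is full of litter.",
    language="English",
    style="polite",
    topic="environment",  # illustrative label, not taken from the dataset
)
```

The resulting `messages` list has the same shape as the chat messages passed to `processor.apply_chat_template` in the Transformers example.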

Role in Hate-2-Action

The complete Hate-2-Action pipeline:

1. Receives a complaint or problem from a Telegram user.
2. Detects the language and conversation intent.
3. Extracts problems and possible solution concepts.
4. Uses embeddings and vector similarity to find relevant NGOs and projects.
5. Passes the resulting context to the final response generator.
6. Produces a short, styled, actionable answer.

This model is intended for step 6: final natural-language answer generation.
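
The vector-similarity lookup that precedes this model in the pipeline can be illustrated with plain cosine similarity. The sketch below is not the project's implementation: the project names and the three-dimensional vectors are invented, and a real system would use embeddings from an embedding model and most likely a vector database.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query: list[float], candidates: dict[str, list[float]], k: int = 2):
    """Return the k candidate names most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy vectors, invented for illustration only.
projects = {
    "park-cleanup": [0.9, 0.1, 0.0],
    "legal-aid": [0.0, 0.2, 0.9],
    "recycling-drive": [0.8, 0.3, 0.1],
}
problem_vector = [1.0, 0.2, 0.0]
matches = top_matches(problem_vector, projects, k=2)
```

The names in `matches` would then form part of the context handed to the final response generator.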

Training

The model was fine-tuned from:

unsloth/gemma-4-e2b-it-unsloth-bnb-4bit

Fine-tuning was performed using:

Unsloth
Hugging Face TRL

Training data

The model was trained on a dataset from the Hate-2-Action project.

The dataset contains 2,000 supervised examples:

1,000 English examples
1,000 Ukrainian examples
400 examples for each response style
200 examples for each of 10 topic categories

The dataset builder also produced nominal splits:

Train: 1,800 examples
Validation: 100 examples
Test: 100 examples
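
The counts above are mutually consistent, as the short check below shows. The per-cell figure is an assumption: it holds only if language, style, and topic are fully crossed and evenly balanced, which this card does not state.

```python
TOTAL = 2000
LANGUAGES = 2   # English, Ukrainian
STYLES = 5      # normal, polite, funny, sarcastic, rude
TOPICS = 10

# Each marginal count multiplies back to the full dataset size.
assert 1000 * LANGUAGES == TOTAL
assert 400 * STYLES == TOTAL
assert 200 * TOPICS == TOTAL

# Assumption: a fully crossed, balanced design (not confirmed by the card).
cells = LANGUAGES * STYLES * TOPICS
per_cell = TOTAL // cells

# The nominal splits cover the whole dataset.
splits = {"train": 1800, "validation": 100, "test": 100}
assert sum(splits.values()) == TOTAL
fractions = {name: count / TOTAL for name, count in splits.items()}
```

Under that assumption each language, style, and topic combination has 20 examples, and the splits correspond to 90%, 5%, and 5% of the data.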

Recommended deployment

For production use, provide the model with verified organization and project context retrieved from the Hate-2-Action database.

Generated links, factual claims, and organization details should be validated before being shown to users.
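
One way to validate links is to compare every URL in a generated reply against the set of URLs that came from the retrieved context. The following is a minimal sketch of that idea; the function name `find_unverified_links` and the `example.org` addresses are hypothetical, and the regular expression is a simple approximation rather than a full URL parser.

```python
import re

# Matches http(s) URLs up to the next whitespace, quote, or closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def find_unverified_links(response: str, verified_urls: set[str]) -> list[str]:
    """Return URLs in the response that are not in the verified set."""
    # Strip trailing sentence punctuation that the pattern may have captured.
    found = [url.rstrip(".,;:!?") for url in URL_PATTERN.findall(response)]
    return [url for url in found if url not in verified_urls]


verified = {"https://example.org/park-cleanup"}
reply = (
    "Join the cleanup at https://example.org/park-cleanup. "
    "You can also donate at https://example.org/made-up-page."
)
unverified = find_unverified_links(reply, verified)
```

A reply with a non-empty `unverified` list could be regenerated, or have the offending links removed, before it reaches the user.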

This model is an experimental project-specific checkpoint and should be evaluated against a new, separately created test set before replacing the existing production LLM.

Safetensors

Model size: 5B params
Tensor type: BF16