Instructions to use google/shieldgemma-2-4b-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use google/shieldgemma-2-4b-it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/shieldgemma-2-4b-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModelForImageClassification
model = AutoModelForImageClassification.from_pretrained("google/shieldgemma-2-4b-it", dtype="auto")

Local Apps

vLLM

How to use google/shieldgemma-2-4b-it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "google/shieldgemma-2-4b-it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/shieldgemma-2-4b-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/google/shieldgemma-2-4b-it

SGLang

How to use google/shieldgemma-2-4b-it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "google/shieldgemma-2-4b-it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/shieldgemma-2-4b-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "google/shieldgemma-2-4b-it" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use google/shieldgemma-2-4b-it with Docker Model Runner:
```
docker model run hf.co/google/shieldgemma-2-4b-it
```

Unexpected Warning When Loading `google/shieldgemma-2-4b-it` & Low Accuracy on Custom Dataset

by Haulyn5 - opened Sep 22, 2025

Discussion

Haulyn5

Sep 22, 2025

Hello,

I recently tried the google/shieldgemma-2-4b-it model, strictly following the example code in the model’s README (with only the `token` parameter added). However, I got an unexpected warning while loading it:

Some weights of ShieldGemma2ForImageClassification were not initialized from the model checkpoint at google/shieldgemma-2-4b-it and are newly initialized: ['model.lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Additionally, when evaluating the model on my custom dataset, its accuracy only reached approximately **50%**—a result far below my expectations. Given the warning above, I suspect this performance discrepancy may be linked to the uninitialized weights mentioned.

I would greatly appreciate clarification on the following questions:

1. Is this warning expected behavior when loading google/shieldgemma-2-4b-it?
2. Could this uninitialized weight warning be the root cause of the unexpectedly low accuracy on my dataset?
3. What steps are recommended to debug or verify whether the model weights have been correctly initialized?

Environment Details:

Python version: 3.11.2
PyTorch version: 2.8.0+cu128
Transformers version: 4.56.2
OS: Linux 5.4.143-amd64
GPU: Tesla V100-SXM2-32GB

Thanks in advance for your time and assistance!

CC: @merve , @BalakrishnaCh , @Renu11 , @RyanMullins

BalakrishnaCh

Google org Sep 26, 2025

Hi @Haulyn5 ,

Thanks for reaching out to us. Yes, this warning is generally expected when a language model checkpoint is loaded into a class with a task-specific head, but it is critical to understand which weights are affected. When you call ShieldGemma2ForImageClassification.from_pretrained(), the Hugging Face library loads the original pre-trained model weights and then attempts to map them into a model class that includes a specific image classification head. The flagged weight, ['model.lm_head.weight'], is the language model (LM) head from the original base Gemma architecture.

The ShieldGemma models are instruction-tuned for a specific safety evaluation task: given an image and a policy text, the model outputs a "Yes" or "No" token.
To use the model correctly for image classification on your custom dataset, you need to fine-tune it. The warning gives exactly that advice: "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference."
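
To make the "Yes"/"No" output concrete, here is a minimal sketch of how two token logits could be turned into a pair of probabilities with a softmax. It loads no model. The function names, the logit values, and the 0.5 threshold are hypothetical and chosen only for illustration, and the assumption that the classifier scores an image this way is not confirmed in this thread.

```python
import math


def yes_no_probabilities(yes_logit: float, no_logit: float) -> tuple[float, float]:
    """Softmax over exactly two logits, returned as (p_yes, p_no)."""
    # Subtract the maximum first so exp() cannot overflow for large logits.
    peak = max(yes_logit, no_logit)
    exp_yes = math.exp(yes_logit - peak)
    exp_no = math.exp(no_logit - peak)
    total = exp_yes + exp_no
    return exp_yes / total, exp_no / total


def violates_policy(yes_logit: float, no_logit: float, threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: flag the image when p_yes reaches the threshold."""
    p_yes, _ = yes_no_probabilities(yes_logit, no_logit)
    return p_yes >= threshold


if __name__ == "__main__":
    # Illustrative logits only; real values would come from the model's output.
    print(yes_no_probabilities(2.0, 0.0))
    print(violates_policy(-1.0, 1.0))
```

Under this assumption, two nearly equal logits give probabilities close to 0.5 each, so the decision is very sensitive to whatever produces those logits.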

Recommended Action:
The definitive fix for this problem is to fine-tune the model on your custom dataset. This process will train the randomly initialized lm_head (or the actual classification head being used) to learn the mapping from the model's internal features to your specific output labels, leveraging the powerful pre-trained Gemma backbone.
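
Before fine-tuning, it may help to confirm exactly which parameters the loader reported as missing. The sketch below assumes the `output_loading_info=True` argument of `from_pretrained` in Transformers, which returns a dictionary of missing, unexpected, and mismatched keys alongside the model. The helper names `summarize_loading_info` and `load_and_report` are hypothetical, and the checkpoint is only downloaded if `load_and_report` is actually called.

```python
from typing import Any


def summarize_loading_info(loading_info: dict[str, Any]) -> dict[str, list[str]]:
    """Keep only the non-empty categories of a loading-info dict, sorted for stable output."""
    categories = ("missing_keys", "unexpected_keys", "mismatched_keys", "error_msgs")
    summary: dict[str, list[str]] = {}
    for name in categories:
        values = loading_info.get(name) or []
        if values:
            # Entries may be strings or tuples, so normalise everything to text.
            summary[name] = sorted(str(value) for value in values)
    return summary


def load_and_report(model_id: str = "google/shieldgemma-2-4b-it"):
    """Load the checkpoint and return (model, summary). Requires transformers and model access."""
    # Imported lazily so summarize_loading_info works without transformers installed.
    from transformers import ShieldGemma2ForImageClassification

    model, info = ShieldGemma2ForImageClassification.from_pretrained(
        model_id, output_loading_info=True
    )
    return model, summarize_loading_info(info)
```

If the summary lists only `model.lm_head.weight` under `missing_keys`, that matches the warning quoted above. Any additional entries would point to a different loading problem.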

Thanks.
