Instructions to use google/gemma-3-1b-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use google/gemma-3-1b-it with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="google/gemma-3-1b-it")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use google/gemma-3-1b-it with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "google/gemma-3-1b-it"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-1b-it",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker
```shell
docker model run hf.co/google/gemma-3-1b-it
```
- SGLang
How to use google/gemma-3-1b-it with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "google/gemma-3-1b-it" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-1b-it",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "google/gemma-3-1b-it" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-1b-it",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use google/gemma-3-1b-it with Docker Model Runner:
```shell
docker model run hf.co/google/gemma-3-1b-it
```
What transformers version can this be deployed with?
I tried deploying this model on AWS SageMaker, but it seems the transformers library doesn't have an up-to-date release yet that handles Gemma 3.
How can this be deployed?
Ok thanks. The issue I'm having, though, is that deploying on SageMaker requires specifying a transformers version, and there hasn't been a new release with Gemma 3 support yet. I also tried extending a Deep Learning Container by installing the dev build (4.50.0.dev0) but ran into some compatibility issues.
What would be the easiest way for me to deploy this on sagemaker?
A new stable version of Transformers is now available that is compatible with Gemma 3. Please update it using `pip install -U transformers` and try again. Let us know if this helps! Thank you
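Before redeploying, it can help to verify that the environment will actually pick up a new enough release; Gemma 3 support shipped in the Transformers 4.50.0 stable release. A minimal sketch of such a check, comparing only the numeric release part of a version string (the helper names here are illustrative, not part of any library):

```python
def release_tuple(version: str):
    """Parse the numeric release part of a version string,
    e.g. "4.50.0.dev0" -> (4, 50, 0)."""
    parts = []
    for piece in version.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break  # stop at suffixes like "dev0" or "rc1"
    return tuple(parts)

def supports_gemma3(version: str) -> bool:
    # Gemma 3 support shipped in the 4.50.0 stable release of Transformers.
    # Caveat: this simple check treats pre-releases such as "4.50.0.dev0"
    # the same as the final release.
    return release_tuple(version) >= (4, 50, 0)

print(supports_gemma3("4.49.0"))   # False
print(supports_gemma3("4.50.3"))   # True
```

In a live environment you would pass `transformers.__version__` to the check; `packaging.version.parse` is the more robust choice when the `packaging` library is available, since it handles pre-release ordering correctly.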
I had a similar issue, and even the newer version didn't help. It gave me the error below.
```
Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 21, in <module>
    from sagemaker_huggingface_inference_toolkit import serving
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/serving.py", line 18, in <module>
    from sagemaker_huggingface_inference_toolkit import handler_service, mms_model_server
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 28, in <module>
    from sagemaker_huggingface_inference_toolkit.transformers_utils import (
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/transformers_utils.py", line 24, in <module>
    from transformers.pipelines import Conversation, Pipeline
ImportError: cannot import name 'Conversation' from 'transformers.pipelines' (/opt/conda/lib/python3.10/site-packages/transformers/pipelines/__init__.py)
```
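The traceback points at the SageMaker inference toolkit rather than the model itself: `transformers_utils.py` unconditionally imports `Conversation` from `transformers.pipelines`, but that class (along with the conversational pipeline) was removed in newer Transformers releases, so a toolkit built against the older API fails at import time. A minimal sketch reproducing just that mismatch, guarding the same import the toolkit performs unconditionally:

```python
# The toolkit runs: from transformers.pipelines import Conversation, Pipeline
# In Transformers releases where Conversation has been removed, that line
# raises ImportError before the server can even start.
try:
    from transformers.pipelines import Conversation  # noqa: F401
    has_conversation = True
except ImportError:
    # Raised when the class was removed (ModuleNotFoundError, a subclass,
    # also lands here if transformers is not installed at all).
    has_conversation = False

print(has_conversation)
```

In other words, upgrading Transformers alone is not enough here: the container's `sagemaker_huggingface_inference_toolkit` must also be new enough not to depend on the removed class.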