Some weights of Gemma3TextModel were not initialized from the model checkpoint at google/gemma-3-270m-it
I get a warning about the model weights not being loaded:
from transformers import AutoModel
AutoModel.from_pretrained("google/gemma-3-270m")
AutoModel is mapped to Gemma3TextModel, but every tensor in this repository's safetensors file has a "model." prefix, so loading it via Gemma3ForCausalLM works:
from transformers import Gemma3ForCausalLM
Gemma3ForCausalLM.from_pretrained("google/gemma-3-270m")
However, the lm_head weights seem to be missing from the safetensors file.
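Both observations (the shared prefix and the absent lm_head) can be checked by looking at the tensor names alone. Below is a minimal sketch, assuming the names have already been read from the checkpoint, for example with safetensors' safe_open(...).keys(). The helper name summarize_tensor_names and the sample names are hypothetical stand-ins, not the actual contents of the repository.

```python
from collections import Counter


def summarize_tensor_names(names):
    """Count top-level prefixes and report whether any lm_head tensor exists."""
    # The top-level prefix is everything before the first dot.
    prefixes = Counter(name.split(".", 1)[0] for name in names)
    has_lm_head = any(name.startswith("lm_head.") for name in names)
    return prefixes, has_lm_head


# Hypothetical, heavily shortened list of tensor names (an assumption,
# used only to illustrate the shape of the check).
sample_names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.norm.weight",
]

prefixes, has_lm_head = summarize_tensor_names(sample_names)
print(dict(prefixes), has_lm_head)
```

If every name falls under the "model" prefix and no lm_head tensor is reported, the file matches what is described above.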
Hi @ShukantP,
Welcome to Google's Gemma models, thanks for reaching out to us.
The behavior you're observing is a common consequence of how Hugging Face's transformers library maps model architectures and matches state dictionary keys for different model types.
When you call AutoModel.from_pretrained("google/gemma-3-270m"), the AutoModel class inspects the repository's configuration and determines that the base architecture is Gemma3TextModel. That class expects keys for the text model only, without a prefix. Because the weights in the checkpoint are prefixed with "model.", they do not match the keys expected by the bare Gemma3TextModel instance, which leads to the warning about weights not being initialized.
When you call Gemma3ForCausalLM.from_pretrained("google/gemma-3-270m"), you are explicitly loading the architecture that includes the causal language modeling head.
AutoModel --> Gemma3TextModel --> Keys for base model (e.g., embed_tokens.weight) --> The weights in the file have a model. prefix which doesn't match the keys of the raw Gemma3TextModel instance.
Gemma3ForCausalLM --> Gemma3ForCausalLM --> Keys for model. and lm_head. --> The Gemma3ForCausalLM class has a submodule named model, making the checkpoint keys match.
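The two cases above can be illustrated with a small name-matching sketch. It is a simplified model of strict key comparison, not the loading logic transformers actually uses. The helper diff_keys and all key lists are hypothetical and heavily shortened.

```python
def diff_keys(expected, checkpoint):
    """Compare parameter names the way a strict, name-by-name loader would.

    Returns (missing, unexpected): names the model wants but the file lacks,
    and names the file has but the model does not know.
    """
    expected, checkpoint = set(expected), set(checkpoint)
    missing = sorted(expected - checkpoint)
    unexpected = sorted(checkpoint - expected)
    return missing, unexpected


# Hypothetical key lists (assumptions for illustration only).
checkpoint_keys = ["model.embed_tokens.weight", "model.norm.weight"]
base_model_keys = ["embed_tokens.weight", "norm.weight"]
causal_lm_keys = [
    "model.embed_tokens.weight",
    "model.norm.weight",
    "lm_head.weight",
]

# Base model: no name lines up, so everything is both missing and unexpected.
base_missing, base_unexpected = diff_keys(base_model_keys, checkpoint_keys)

# Causal LM: the "model." submodule makes the names line up; only the head
# is absent from the file in this sketch.
lm_missing, lm_unexpected = diff_keys(causal_lm_keys, checkpoint_keys)

print("base model:", base_missing, base_unexpected)
print("causal LM: ", lm_missing, lm_unexpected)
```

In this sketch the causal LM case still reports lm_head.weight as missing, which mirrors the observation in the question. The sketch does not model weight tying between the head and the embedding matrix, which is one common reason for a head tensor to be absent from a checkpoint; the thread itself does not confirm the cause.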
Thanks.