Instructions to use google/gemma-3-270m with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use google/gemma-3-270m with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="google/gemma-3-270m")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-270m")
model = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use google/gemma-3-270m with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "google/gemma-3-270m"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-270m",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/google/gemma-3-270m
- SGLang
How to use google/gemma-3-270m with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "google/gemma-3-270m" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-270m",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "google/gemma-3-270m" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-270m",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use google/gemma-3-270m with Docker Model Runner:
docker model run hf.co/google/gemma-3-270m
This model breaks the gemma-3-{size}-pt naming convention
Every other pretrained (base) model in the Gemma 3 family has a "-pt" suffix, while this one does not. As a result, any application that works with several sizes of these models has to carve out a special case for the 270m base model.
Ideally, an exact copy of this repository would now be uploaded under the name google/gemma-3-270m-pt, while keeping this one in place so that applications already relying on the special case do not break. This would help usability and building on top of this wonderful family of models. Thank you!
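To illustrate the special case described above, here is a minimal sketch of what an application might have to do today. The helper name `base_repo_id` and the `NO_PT_SUFFIX` set are hypothetical, made up for this example; the only facts taken from the thread are that the 270m base model lives at google/gemma-3-270m and that the other base models use the gemma-3-{size}-pt pattern.

```python
# Hypothetical helper (not part of any library): maps a Gemma 3 size string
# to the repository id of its pretrained (base) model.

# Sizes whose base repository does NOT carry the "-pt" suffix.
# Per this thread, 270m is currently the only such case.
NO_PT_SUFFIX = {"270m"}


def base_repo_id(size: str) -> str:
    """Return the base-model repo id for a given Gemma 3 size, e.g. "1b"."""
    if size in NO_PT_SUFFIX:
        # The special case this discussion is about.
        return f"google/gemma-3-{size}"
    return f"google/gemma-3-{size}-pt"


if __name__ == "__main__":
    for size in ("270m", "1b", "4b"):
        print(size, "->", base_repo_id(size))
```

If a google/gemma-3-270m-pt copy were uploaded as proposed, new code could drop the special case entirely, while code like the above would keep working because the existing repository would stay in place.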
Very. Give me a very, very, very, very, very long story.
Give me a very, very long story.
Hi @Lovre,
Thanks for bringing this up!
You're absolutely correct that the lack of a -pt suffix on the `google/gemma-3-270m` model creates an inconsistency with the rest of the Gemma 3 family. This can indeed complicate development workflows and require special handling in applications designed to work with various model sizes.
I've passed this feedback along to the team for consideration.
Thanks again for the thoughtful suggestions - it's exactly the kind of detail that improves the experience for everyone building on top of this!