🧹 Gemma-3-270M Machine Unlearning Project
This project demonstrates machine unlearning using Google’s Gemma-3-270M.
The aim was to fine-tune the model so that it forgets specific tokens, phrases, or patterns without affecting the rest of its knowledge.
📌 What I Did
Downloaded the base Gemma-3-270M model locally from Hugging Face.
Created a scrubbing dataset with words and phrases that should be forgotten by the model.
Implemented an unlearning script that loads the model and tokenizer, applies a forgetting mechanism to the scrubbing dataset, fine-tunes the model, and saves the updated version to a new folder.
Ran the fine-tuning loop so that the model unlearned the targeted content while keeping its general language understanding intact.
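The steps above can be sketched roughly as follows. This README does not say which forgetting mechanism the script uses, so the sketch assumes plain gradient ascent on the scrubbed phrases (the language-modeling loss is negated so training makes those phrases less likely). The function names, arguments, and the one-phrase-per-line file format are all hypothetical, not the project's actual code.

```python
from pathlib import Path


def parse_scrub_phrases(text):
    """Split raw text into phrases, one per line, dropping blanks and repeats."""
    seen, phrases = set(), []
    for line in text.splitlines():
        phrase = line.strip()
        if phrase and phrase not in seen:
            seen.add(phrase)
            phrases.append(phrase)
    return phrases


def unlearn(base_dir, scrub_path, out_dir, epochs=1, lr=1e-5):
    """Gradient-ascent unlearning sketch. All arguments are hypothetical."""
    # Heavy imports stay local so parse_scrub_phrases works without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_dir)
    model = AutoModelForCausalLM.from_pretrained(base_dir)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    phrases = parse_scrub_phrases(Path(scrub_path).read_text(encoding="utf-8"))

    model.train()
    for _ in range(epochs):
        for phrase in phrases:
            batch = tokenizer(phrase, return_tensors="pt")
            # Standard causal-LM loss on the phrase to be forgotten.
            loss = model(
                input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=batch["input_ids"],
            ).loss
            # Assumption: negate the loss so the optimizer raises it,
            # which lowers the probability of the scrubbed phrase.
            (-loss).backward()
            optimizer.step()
            optimizer.zero_grad()

    # Save the updated model and tokenizer into the new folder.
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
```

Gradient ascent with a large learning rate or many epochs can damage unrelated abilities, so a small learning rate and few passes are the safer starting point in a sketch like this.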
📂 Project Structure
Base Model Files: Configuration, weights, and tokenizer files.
Scrub Dataset: Custom file with words or phrases to be forgotten.
Unlearning Script: Python script used to train the model with unlearning objectives.
Fine-Tuned Model Folder: Contains the updated model files after unlearning.
▶️ Workflow
Install the necessary dependencies.
Run the unlearning script.
The script creates a new fine-tuned model folder that contains the updated model.
Load the new model for testing and verify that it avoids the scrubbed tokens while retaining normal abilities.
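The last workflow step might look like the sketch below. It is one way to check for leaks, not the project's own test: `find_leaks` and `make_generator` are hypothetical helpers, the match is a simple case-insensitive substring test, and the model folder path is whatever the unlearning script wrote.

```python
def find_leaks(generate, prompts, scrubbed_phrases):
    """Return (prompt, phrase) pairs where the output still contains a phrase."""
    leaks = []
    for prompt in prompts:
        output = generate(prompt).lower()
        for phrase in scrubbed_phrases:
            # Case-insensitive substring match; a stricter check could
            # compare token IDs instead.
            if phrase.lower() in output:
                leaks.append((prompt, phrase))
    return leaks


def make_generator(model_dir, max_new_tokens=40):
    """Wrap a local model folder in a prompt -> text function."""
    # Imported lazily so find_leaks can be used with any text generator.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_dir)

    def generate(prompt):
        result = pipe(
            prompt,
            max_new_tokens=max_new_tokens,
            return_full_text=False,  # only the continuation, not the prompt
        )
        return result[0]["generated_text"]

    return generate
```

Running `find_leaks(make_generator(folder), prompts, phrases)` against the fine-tuned folder should return an empty list if the scrubbed phrases no longer appear; running it against the base model folder gives the "before" comparison.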
✅ Output
Before unlearning: The model could generate definitions or explanations for the scrubbed words and phrases.
After unlearning: The model avoids producing the scrubbed tokens, instead giving responses such as “I don’t know,” placeholder tokens, or irrelevant outputs.
The final model continues to function normally for all other queries while no longer producing the forgotten content.
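The claim that the model still works normally on other queries can be checked with a retention metric. The sketch below assumes perplexity on a small set of unrelated texts as that metric, which this README does not mention; `perplexity_from_losses` and `model_perplexity` are hypothetical names. Similar perplexity for the base and fine-tuned folders would suggest general ability was preserved.

```python
import math


def perplexity_from_losses(mean_losses, token_counts):
    """Combine per-text mean losses into one token-weighted perplexity."""
    total_tokens = sum(token_counts)
    total_nll = sum(loss * n for loss, n in zip(mean_losses, token_counts))
    return math.exp(total_nll / total_tokens)


def model_perplexity(model_dir, texts):
    """Perplexity of a local model folder on a list of retained texts."""
    # Imported lazily so the pure helper above has no heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    model.eval()

    losses, counts = [], []
    with torch.no_grad():
        for text in texts:
            batch = tokenizer(text, return_tensors="pt")
            out = model(
                input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=batch["input_ids"],
            )
            losses.append(out.loss.item())
            # The loss is averaged over predicted tokens: sequence length - 1.
            counts.append(batch["input_ids"].shape[1] - 1)
    return perplexity_from_losses(losses, counts)
```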