
Libraries

How to use Changgil/google-gemma-3-27b-it-text with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Changgil/google-gemma-3-27b-it-text")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Changgil/google-gemma-3-27b-it-text")
model = AutoModelForCausalLM.from_pretrained("Changgil/google-gemma-3-27b-it-text")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Changgil/google-gemma-3-27b-it-text with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Changgil/google-gemma-3-27b-it-text"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Changgil/google-gemma-3-27b-it-text",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
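
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on localhost:8000, and the helper names `build_chat_request` and `chat` are illustrative, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Changgil/google-gemma-3-27b-it-text"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `chat("What is the capital of France?")["choices"][0]["message"]["content"]` should return the generated text. Passing `base_url="http://localhost:30000"` would point the same helper at a different port.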

Use Docker

docker model run hf.co/Changgil/google-gemma-3-27b-it-text

SGLang

How to use Changgil/google-gemma-3-27b-it-text with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Changgil/google-gemma-3-27b-it-text" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Changgil/google-gemma-3-27b-it-text",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
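
The server replies with an OpenAI-style chat completion object. As a minimal sketch, the assistant text can be pulled out of that JSON as shown below; the helper name `extract_reply` is hypothetical, and the sample body is a hand-written placeholder trimmed to the fields used here, not real model output.

```python
import json


def extract_reply(response):
    """Return the assistant text from an OpenAI-style chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Placeholder response body for illustration only; not real model output.
sample = json.loads("""
{
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "PLACEHOLDER REPLY"},
      "finish_reason": "stop"
    }
  ]
}
""")

print(extract_reply(sample))
```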

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Changgil/google-gemma-3-27b-it-text" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Changgil/google-gemma-3-27b-it-text",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Changgil/google-gemma-3-27b-it-text with Docker Model Runner:
```
docker model run hf.co/Changgil/google-gemma-3-27b-it-text
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Gemma 3 Text-Only Model Card

Model Information

Original Model: Gemma 3 by Google DeepMind

Adaptation: Text-only version (Image processing capabilities removed)

Description

This is a text-only version of Gemma 3 27B, adapted from Google's multimodal google/gemma-3-27b-it. The image processing components have been removed, and the text generation capabilities are preserved.

The adaptation keeps the core language capabilities, including a 128K-token context window and multilingual support for over 140 languages. The model is well suited to text generation tasks such as question answering, summarization, and reasoning.

Without the image components, the model is lighter and fits environments where only text processing is needed, or where resource constraints make the full multimodal model impractical.

Inputs and outputs

Input:
- Text string, such as a question, a prompt, or a document to be summarized
- Total input context of 128K tokens for the 27B size
Output:
- Generated text in response to the input, such as an answer to a question or a summary of a document
- Total output context of 8192 tokens
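
The limits above can be turned into a simple pre-flight check before calling `generate`. This is a minimal sketch under two assumptions that this card does not state: that "128K" means 131,072 tokens, and that the prompt and the generated tokens share the same window. The function name `clamp_max_new_tokens` is illustrative.

```python
CONTEXT_WINDOW = 128 * 1024  # assumption: "128K" read as 131,072 tokens
MAX_OUTPUT_TOKENS = 8192     # output limit stated in this card


def clamp_max_new_tokens(prompt_tokens, requested):
    """Return how many new tokens can be requested for a prompt of this length."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt does not fit in the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return min(requested, MAX_OUTPUT_TOKENS, remaining)


print(clamp_max_new_tokens(1_000, 512))      # 512
print(clamp_max_new_tokens(1_000, 20_000))   # 8192
print(clamp_max_new_tokens(130_000, 4_096))  # 1072
```

The prompt length would come from the tokenizer, for example `len(tokenizer(text)["input_ids"])`.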

Adaptation Details

This adaptation:

- Removes the image processing components from the model
- Maintains the same text tokenization and generation capabilities
- Is compatible with standard text-only inference pipelines
- Can be used with regular AutoModelForCausalLM instead of requiring specialized multimodal classes
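
One way to sanity-check a checkpoint against the points above is to inspect its configuration. The sketch below is a heuristic only: it assumes a multimodal config carries a `vision_config` entry and that a text-only one lists a `...ForCausalLM` class under `architectures`. The function name `looks_text_only` and both example dicts are hand-written for illustration and are not copied from any repository.

```python
def looks_text_only(config):
    """Heuristic: True if a config dict lists a causal LM and no vision settings."""
    has_vision = "vision_config" in config
    architectures = config.get("architectures") or []
    causal_lm = any(name.endswith("ForCausalLM") for name in architectures)
    return causal_lm and not has_vision


# Hypothetical config fragments, written by hand for this example.
multimodal_like = {
    "architectures": ["Gemma3ForConditionalGeneration"],
    "vision_config": {},
}
text_only_like = {"architectures": ["Gemma3ForCausalLM"]}

print(looks_text_only(multimodal_like))  # False
print(looks_text_only(text_only_like))   # True
```

A real config can be obtained as a dict with `AutoConfig.from_pretrained(model_id).to_dict()` from Transformers.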

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Changgil/google-gemma-3-27b-it-text"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are an AI assistant that provides helpful and accurate information."},
    {"role": "user", "content": "Hello. How's the weather today?"}
]

# return_dict=True returns input_ids and attention_mask together
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
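
The usage example samples with a temperature of 0.2. As a rough illustration of what that setting does, the sketch below divides a set of made-up logits by the temperature before applying softmax; it is a plain-Python toy, not the library's sampling code, and `softmax_with_temperature` is a hypothetical helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (1.0, 0.2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature, almost all of the probability mass moves to the highest-scoring token, which makes the output more deterministic.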

Downloads last month: 11

Safetensors

Model size: 27B params
Tensor type: BF16

Model tree for Changgil/google-gemma-3-27b-it-text

Base model: google/gemma-3-27b-pt
Finetuned: google/gemma-3-27b-it
Finetuned (455): this model
Adapters: 1 model
Quantizations: 2 models
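
The 27B parameter count and BF16 tensor type listed above give a rough lower bound on the memory needed for the weights alone. The sketch below treats "27B" as exactly 27 billion parameters, which is an assumption, and ignores the KV cache, activations, and framework overhead.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


BF16_BYTES = 2  # bfloat16 stores each parameter in 2 bytes

print(round(weight_memory_gib(27e9, BF16_BYTES), 1))  # 50.3
```

So the BF16 weights need on the order of 50 GiB before any inference-time memory is counted.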