Gemma 3 Text-Only Model Card
Model Information
Original Model: Gemma 3 by Google DeepMind
Adaptation: Text-only version (Image processing capabilities removed)
Description
This is a text-only version of the Gemma 3 model, adapted from Google's original multimodal Gemma 3. The image processing capabilities have been removed while preserving the text generation capabilities.
This text-only adaptation maintains the core language capabilities with a 128K context window and multilingual support in over 140 languages. The model is well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning.
The adaptation makes the model more lightweight and suitable for environments where only text processing is needed, or where resource constraints make the full multimodal model impractical.
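As a rough guide to the resource constraints mentioned above, the memory needed just to hold the weights scales with parameter count times bytes per parameter. The sketch below assumes a nominal 27 billion parameters (the exact count of this text-only checkpoint is not stated here) and ignores activations, the KV cache, and framework overhead. `estimate_weight_gib` is a hypothetical helper, not part of any library.

```python
# Bytes used per parameter for some common weight formats.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_weight_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Assumption: "27B" is taken as exactly 27e9 parameters for illustration.
NOMINAL_PARAMS = 27e9

for dtype in BYTES_PER_PARAM:
    size = estimate_weight_gib(NOMINAL_PARAMS, dtype)
    print(f"{dtype:>8}: ~{size:.0f} GiB for weights only")
```

Real memory use will be higher than these figures, especially with long contexts, so treat them as a lower bound when choosing hardware.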
Inputs and outputs
- Input:
- Text string, such as a question, a prompt, or a document to be summarized
- Total input context of 128K tokens for the 27B size
- Output:
- Generated text in response to the input, such as an answer to a question or a summary of a document
- Total output context of 8192 tokens
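The limits above can be checked before calling the model. The following is a minimal sketch, assuming "128K" means 131,072 tokens and that the input and output limits are enforced independently; neither detail is spelled out in this card. `clamp_generation_budget` is a hypothetical helper name.

```python
MAX_INPUT_TOKENS = 128 * 1024  # assumption: "128K" read as 131,072 tokens
MAX_OUTPUT_TOKENS = 8192       # output limit stated in this card


def clamp_generation_budget(prompt_tokens: int, requested_new_tokens: int) -> int:
    """Reject over-long prompts and cap the number of new tokens to generate."""
    if prompt_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"Prompt has {prompt_tokens} tokens; the limit is {MAX_INPUT_TOKENS}."
        )
    return min(requested_new_tokens, MAX_OUTPUT_TOKENS)


print(clamp_generation_budget(1_000, 10_000))  # prints 8192
```

In practice, `prompt_tokens` would be the length of the tokenized prompt, and the returned value would be passed as `max_new_tokens` to `generate`.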
Adaptation Details
This adaptation:
- Removes the image processing components from the model
- Maintains the same text tokenization and generation capabilities
- Is compatible with standard text-only inference pipelines
- Can be used with the regular AutoModelForCausalLM class instead of requiring specialized multimodal classes
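One way to tell whether a checkpoint has been adapted this way is to inspect its `config.json`. The sketch below assumes the usual Transformers layout, in which `architectures` lists the model class and multimodal configs carry a `vision_config` entry. The two sample dictionaries are illustrative and were not copied from any repository; `looks_text_only` is a hypothetical helper.

```python
def looks_text_only(config: dict) -> bool:
    """Heuristic: a causal-LM architecture with no vision sub-config."""
    architectures = config.get("architectures") or []
    is_causal_lm = bool(architectures) and all(
        name.endswith("ForCausalLM") for name in architectures
    )
    return is_causal_lm and "vision_config" not in config


# Illustrative configs (assumed shapes, not real files).
text_only_cfg = {"architectures": ["Gemma3ForCausalLM"]}
multimodal_cfg = {
    "architectures": ["Gemma3ForConditionalGeneration"],
    "vision_config": {},
}

print(looks_text_only(text_only_cfg))   # prints True
print(looks_text_only(multimodal_cfg))  # prints False
```

This is only a heuristic; loading the checkpoint with `AutoModelForCausalLM` remains the definitive check.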
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

# Repository ID of this model; replace it with a local path if you have your own copy.
model_id = "Changgil/google-gemma-3-27b-it-text"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are an AI assistant that provides helpful and accurate information."},
    {"role": "user", "content": "Hello. How's the weather today?"},
]

# Apply the chat template; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
Model tree for Changgil/google-gemma-3-27b-it-text
Base model
google/gemma-3-27b-pt