Instructions for using NotoriousH2/gemma-3-12b-it-TextOnly with libraries and local apps.

Libraries

How to use NotoriousH2/gemma-3-12b-it-TextOnly with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NotoriousH2/gemma-3-12b-it-TextOnly")
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))  # outside a notebook the result must be printed explicitly

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NotoriousH2/gemma-3-12b-it-TextOnly")
model = AutoModelForCausalLM.from_pretrained("NotoriousH2/gemma-3-12b-it-TextOnly")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use NotoriousH2/gemma-3-12b-it-TextOnly with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NotoriousH2/gemma-3-12b-it-TextOnly"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NotoriousH2/gemma-3-12b-it-TextOnly",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
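
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and it assumes the server started above is reachable on `localhost:8000` and returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request

MODEL_ID = "NotoriousH2/gemma-3-12b-it-TextOnly"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body; needs a running server.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` mirrors the curl call above. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.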

SGLang

How to use NotoriousH2/gemma-3-12b-it-TextOnly with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NotoriousH2/gemma-3-12b-it-TextOnly" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NotoriousH2/gemma-3-12b-it-TextOnly",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NotoriousH2/gemma-3-12b-it-TextOnly" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use NotoriousH2/gemma-3-12b-it-TextOnly with Docker Model Runner:
```
docker model run hf.co/NotoriousH2/gemma-3-12b-it-TextOnly
```

NotoriousH2 committed on Jul 13, 2025

Commit efbf437 (verified) · 1 Parent(s): 7d9a0a3

Upload Gemma3ForCausalLM

Files changed (8)

config.json +2 -2
generation_config.json +13 -0
model-00001-of-00005.safetensors +2 -2
model-00002-of-00005.safetensors +2 -2
model-00003-of-00005.safetensors +2 -2
model-00004-of-00005.safetensors +2 -2
model-00005-of-00005.safetensors +2 -2
model.safetensors.index.json +0 -0

config.json CHANGED

@@ -1,6 +1,6 @@
 {
   "architectures": [
-    "Gemma3TextModel"
+    "Gemma3ForCausalLM"
   ],
   "attention_bias": false,
   "attention_dropout": 0.0,
@@ -31,7 +31,7 @@
   "sliding_window": 1024,
   "sliding_window_pattern": 6,
   "torch_dtype": "bfloat16",
-  "transformers_version": "4.52.4",
+  "transformers_version": "4.51.2",
   "use_cache": true,
   "vocab_size": 262208
 }

generation_config.json ADDED

@@ -0,0 +1,13 @@
+{
+  "bos_token_id": 2,
+  "cache_implementation": "hybrid",
+  "do_sample": true,
+  "eos_token_id": [
+    1,
+    106
+  ],
+  "pad_token_id": 0,
+  "top_k": 64,
+  "top_p": 0.95,
+  "transformers_version": "4.51.2"
+}
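
With `do_sample` enabled, `top_k` and `top_p` restrict which tokens can be drawn at each step. The sketch below illustrates the idea in plain Python: keep the `top_k` most likely tokens, then keep the smallest prefix of those whose renormalized cumulative probability reaches `top_p`. It is a simplified illustration, not the Transformers implementation, and the function names `top_k_top_p_filter` and `sample` are hypothetical.

```python
import random


def top_k_top_p_filter(probs, top_k=64, top_p=0.95):
    """Return {token_id: renormalized probability} after top-k, then top-p filtering."""
    # Rank token ids from most to least likely and keep at most top_k of them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Cumulative mass is measured relative to the top-k survivors.
    k_mass = sum(probs[i] for i in ranked)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break  # the token that crosses the threshold is still kept
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample(probs, rng, top_k=64, top_p=0.95):
    """Draw one token id from the filtered, renormalized distribution."""
    filtered = top_k_top_p_filter(probs, top_k=top_k, top_p=top_p)
    ids = list(filtered)
    return rng.choices(ids, weights=[filtered[i] for i in ids], k=1)[0]
```

For a toy distribution such as `[0.5, 0.2, 0.2, 0.06, 0.04]`, the defaults drop only the last token, because the first four already cover at least 95% of the probability mass.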

model-00001-of-00005.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:58c7fcd1123da37d9642cfc640febada31dde995588a01da0980d9f8efd1a4f1
-size 4915892480
+oid sha256:1a4c14f38ffc9e23d873fa999eea1fc965dcedcec267e7a99f07c6fdbe8760d8
+size 4915892992

model-00002-of-00005.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:191b845a7088b620a10e73fe8233d843959218e89154465615b98bbec24f034b
-size 4931293608
+oid sha256:0335ce336b31e38c581326acb77ff28330c2073764b1924ad7108d788b56ba68
+size 4931294472

model-00003-of-00005.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d674d1ffd75177135be9cba5b0a24403ba98d7208e8fa2fdeefd6e56864bba17
-size 4931293664
+oid sha256:e0d6a60cf6a1708be0613ab7c4d10457ad99b98c02e40ebd5f320fe02aa0d50c
+size 4931294528

model-00004-of-00005.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3c62e7b3e9b2d32284bfa84cb26d85a18166abc144e44bdeeb620df8cdadb13d
-size 4931293664
+oid sha256:b8b76dd5c3b1bb971354de059db455aac1c59fea624bf8a0289e37557ff43261
+size 4931294528

model-00005-of-00005.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3dfebc2cd5e0fa6fc45be1b55e75d006a5aa125836c7912772f0ee8d9d4fe2ac
-size 3822364144
+oid sha256:6d9c549d5bc84e50ecc0c62c8bf5edebd5036ac83bceabccf75b83689f58f370
+size 3822364808
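
Each shard diff above changes only a Git LFS pointer: the `oid` is the SHA-256 of the real file and `size` is its length in bytes. A downloaded shard can therefore be checked against its pointer. The following is a minimal sketch assuming the three-line pointer format shown above; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'version / oid / size' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash the pointer records."""
    # Cheap check first: a size mismatch rules the file out without hashing it.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Usage would look like `matches_pointer("model-00005-of-00005.safetensors", parse_lfs_pointer(pointer_text))`, where `pointer_text` holds the new pointer contents for that shard.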

model.safetensors.index.json CHANGED

The diff for this file is too large to render.