Instructions for using prithivMLmods/GWQ-9B-Preview2 with libraries and local apps.

Libraries

How to use prithivMLmods/GWQ-9B-Preview2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/GWQ-9B-Preview2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; outside a notebook the return value is otherwise discarded
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/GWQ-9B-Preview2")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/GWQ-9B-Preview2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
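# Render the chat template and tokenize it into model-ready tensors
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate up to 40 new tokens, then decode only the tokens after the prompt
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))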

Local Apps

vLLM

How to use prithivMLmods/GWQ-9B-Preview2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/GWQ-9B-Preview2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/GWQ-9B-Preview2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
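
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. `build_chat_request` and `chat` are hypothetical helper names introduced here for illustration; they are not part of vLLM.

```python
import json
import urllib.request

# Assumption: the server from the previous step is running locally on port 8000.
BASE_URL = "http://localhost:8000/v1"
MODEL = "prithivMLmods/GWQ-9B-Preview2"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build the same POST request as the curl call, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url=BASE_URL, model=MODEL, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, base_url=base_url, model=model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` sends the same question as the curl call. Passing a different `base_url` (for example one ending in port 30000) should let the sketch target another OpenAI-compatible server.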

Use Docker

docker model run hf.co/prithivMLmods/GWQ-9B-Preview2

SGLang

How to use prithivMLmods/GWQ-9B-Preview2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/GWQ-9B-Preview2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/GWQ-9B-Preview2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/GWQ-9B-Preview2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/GWQ-9B-Preview2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use prithivMLmods/GWQ-9B-Preview2 with Docker Model Runner:
```
docker model run hf.co/prithivMLmods/GWQ-9B-Preview2
```

License Compatibility

by qiuqiu666 - opened Jun 24, 2025

Discussion

qiuqiu666

Jun 24, 2025

Hi, I’d like to raise a potential license incompatibility issue regarding prithivMLmods/GWQ-9B-Preview2.

It appears this model is a fine-tuned version of prithivMLmods/GWQ-9B-Preview, which is licensed under the Gemma License from Google.

However, the newer model GWQ-9B-Preview2 is currently published under the CreativeML OpenRAIL-M License, which may not be compatible with the original Gemma license due to conflicting distribution and usage restrictions.

⚠️ License Conflicts:

Gemma License (GWQ-9B-Preview):
• Use is restricted to non-commercial research only
• Redistribution is only allowed under identical terms
• Sublicensing or relicensing is not permitted
• Must include Google's Acceptable Use Policy (AUP)

CreativeML OpenRAIL-M License (GWQ-9B-Preview2):
• Permits redistribution, reuse, and modification with fewer constraints
• May allow commercial usage unless explicitly restricted
• No requirement to pass down Google’s AUP or attribution

Conflict:
→ The Gemma license prohibits relicensing under terms that do not preserve its own.
→ By applying the CreativeML OpenRAIL-M license to a Gemma-derived model, `GWQ-9B-Preview2` drops restrictions that Google requires, including limits on commercial use and license propagation. This may be unintentional, but it potentially leaves downstream users in legal uncertainty.

🔹 Suggestions for Resolving

1. Clearly state that the model is derived from GWQ-9B-Preview (Gemma license)
2. Replace or supplement the current license with the original Gemma License
3. Add a NOTICE file that includes:
   • Attribution to Google
   • Full text or link to the Gemma license
   • A reminder of non-commercial use restrictions
4. Remove CreativeML OpenRAIL-M if it introduces broader rights than allowed under Gemma
5. Optionally, seek clarification from Google if redistribution under a different license is intended

Let me know if I misunderstood anything — happy to help clarify further!

Thanks for your attention!

qiuqiu666

Jul 10, 2025

Looking forward to your response!
