Instructions to use Qwen/Qwen2-7B-Instruct with libraries and local apps.

Libraries

How to use Qwen/Qwen2-7B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Qwen/Qwen2-7B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use Qwen/Qwen2-7B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Qwen/Qwen2-7B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Qwen/Qwen2-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
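
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `http://localhost:8000`; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Qwen/Qwen2-7B-Instruct",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt and return the assistant's reply text (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example call, left commented out because it requires the server to be up:
# print(ask("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should target a server listening on port 30000 instead, since the request shape is the same.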

Use Docker

docker model run hf.co/Qwen/Qwen2-7B-Instruct

SGLang

How to use Qwen/Qwen2-7B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Qwen/Qwen2-7B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Qwen/Qwen2-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Qwen/Qwen2-7B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Qwen/Qwen2-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Qwen/Qwen2-7B-Instruct with Docker Model Runner:
```
docker model run hf.co/Qwen/Qwen2-7B-Instruct
```

Math problems

by saishf - opened Jun 12, 2024

I gave the model a random expression to evaluate, and it can't manage to get it right.
I ask: "What is 2 + 2 × 7 ÷ 6 + 7 ÷ (2 × 2) × (21 × 32)"
Qwen2-7B-Instruct responds:
To solve the expression \(2 + 2 \times 7 \div 6 + 7 \div (2 \times 2) \times (21 \times 32)\), we need to follow the order of operations, often remembered by the acronym PEMDAS: Parentheses, Exponents, Multiplication and Division from left to right, Addition and Subtraction from left to right.

Let's solve it step-by-step:

Step 1: Solve inside Parentheses

\(2 \times 7\)
\(2 \times 2\)
\(21 \times 32\)

Step 2: Perform Multiplication and Division in order from left to right

Step 3: Add the remaining numbers

Let's calculate:

Inside Parentheses:

\(2 \times 7 = 14\)
\(2 \times 2 = 4\)
\(21 \times 32 = 672\)

So, the expression becomes:
\[2 + 14 \div 6 + 7 \div 4 \times 672\]

Next Steps:

Now we solve multiplication and division from left to right:

\(14 \div 6 = 2.33...\) (approximated for simplification, exact value would be more precise)
\(7 \div 4 = 1.75\)
Then multiply this result by 672:
- \(1.75 \times 672\)

Let's calculate these:

Calculation of Multiplication and Division:

\(14 \div 6 = 2.33...\) (approximated for simplicity)
\(7 \div 4 = 1.75\)
\[1.75 \times 672 = 1188\]

So the expression now looks like:
\[2 + 2.33... + 1188\]

Final Addition:

Add all numbers together:

\[2 + 2.33... + 1188 = 1192.33...\]

Rounding to the nearest whole number or keeping it precise depending on context, we might express this as approximately \(1192\) if rounding was intended or precisely \(1192.33...\).

Therefore, following the order of operations and calculations:

\[2 + 2 \times 7 \div 6 + 7 \div (2 \times 2) \times (21 \times 32)\] equals approximately \(1192\) or precisely \(1192.33...\), depending on rounding rules applied.

Yi-9B responds:
To solve this problem, we'll follow the order of operations, which is often remembered by the acronym PEMDAS: Parentheses, Exponents, Multiplication and Division (from left to right), Addition and Subtraction (from left to right).

Given expression: 2 + 2 × 7 ÷ 6 + 7 ÷ (2 × 2) × (21 × 32)

First, we'll solve the parentheses and exponents (there are none in this problem), then multiplication and division from left to right, followed by addition and subtraction from left to right.

Let's break it down step by step:

Multiplication inside parentheses: 2 × 2 = 4
Updated expression: 2 + 2 × 7 ÷ 6 + 7 ÷ 4 × (21 × 32)
Multiplication and division from left to right:
2 × 7 = 14; 14 ÷ 6 = 2.3333 (rounded to four decimal places)
Updated expression: 2 + 2.3333 + 7 ÷ 4 × (21 × 32)
Multiplication inside brackets: 21 × 32 = 672
Updated expression: 2 + 2.3333 + 7 ÷ 4 × 672
Division: 7 ÷ 4 = 1.75
Updated expression: 2 + 2.3333 + 1.75 × 672
Multiplication: 1.75 × 672 = 1176
Final expression: 2 + 2.3333 + 1176
Addition from left to right:
2 + 2.3333 = 4.3333; 4.3333 + 1176 = 1180.3333

So, the result of 2 + 2 × 7 ÷ 6 + 7 ÷ (2 × 2) × (21 × 32) is approximately 1180.3333 when rounded to four decimal places.

The Gemini app responds:
I've calculated the expression following the order of operations (PEMDAS) and accounting for implicit multiplication. The answer is 1,180.33333.

And my Pixel's calculator responds:
1180.3333333333

Both local models are running under Ollama with no changes to the default configs.
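
For reference, the expression can be checked with exact arithmetic. This is a small sketch, not part of any model's output, using Python's `fractions` module; the variable names are illustrative. It agrees with the calculator and with Yi-9B, and it shows that the Qwen2 trace goes wrong at a single step: 1.75 × 672 is 1176, not 1188.

```python
from fractions import Fraction

# Evaluate 2 + 2 × 7 ÷ 6 + 7 ÷ (2 × 2) × (21 × 32) exactly,
# applying multiplication and division left to right within each term.
term_a = Fraction(2)
term_b = Fraction(2) * 7 / 6                # 2 × 7 ÷ 6 = 7/3
term_c = Fraction(7) / (2 * 2) * (21 * 32)  # 7 ÷ 4 × 672
total = term_a + term_b + term_c

print(term_c)                  # 1176
print(total)                   # 3541/3
print(f"{float(total):.4f}")   # 1180.3333

# Value the Qwen2 response reported for 1.75 × 672.
qwen_step = 1188
print(qwen_step - term_c)      # 12
```

The difference of 12 at that step accounts for the whole gap between 1192.33... and 1180.33..., so the order of operations in the Qwen2 response is fine and only the multiplication is off.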
