Instructions to use Qwen/Qwen2.5-Coder-32B-Instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Qwen/Qwen2.5-Coder-32B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Qwen/Qwen2.5-Coder-32B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template and tokenize it in one step.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

vLLM

How to use Qwen/Qwen2.5-Coder-32B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Qwen/Qwen2.5-Coder-32B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Qwen/Qwen2.5-Coder-32B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
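
The same OpenAI-compatible endpoint can also be called from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_chat_payload` and `chat` are hypothetical, and the code assumes a server started with the command above is listening on `http://localhost:8000`.

```python
import json
import urllib.request

MODEL = "Qwen/Qwen2.5-Coder-32B-Instruct"


def build_chat_payload(prompt, model=MODEL):
    """Build the JSON body for POST /v1/chat/completions (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, base_url="http://localhost:8000"):
    """Send one user message and return the assistant's reply text.

    Assumes an OpenAI-compatible server is already running at base_url.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` sends the same request as the curl call. Passing a different `base_url` (for example one ending in port 30000) should work for any other server that exposes the same route.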

Use Docker

docker model run hf.co/Qwen/Qwen2.5-Coder-32B-Instruct

SGLang

How to use Qwen/Qwen2.5-Coder-32B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Qwen/Qwen2.5-Coder-32B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Qwen/Qwen2.5-Coder-32B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Qwen/Qwen2.5-Coder-32B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Qwen/Qwen2.5-Coder-32B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Qwen/Qwen2.5-Coder-32B-Instruct with Docker Model Runner:
```
docker model run hf.co/Qwen/Qwen2.5-Coder-32B-Instruct
```

Thieves!

#36

by supercharge19 - opened Jan 13, 2025

Discussion

supercharge19

Jan 13, 2025

Their website says you will not be charged until you use a service. I wanted to explore what they were offering, so I created an account on the Alibaba website, and they tried to steal money from me!!! Luckily, I had the card blocked before they could take it. Anyway, beware of the thieves and don't do business with them. I don't think anybody does any business with them anyway, they are just exploited to do that.

And no, I DON'T WANT MONEY BACK, as you could not steal anything. I just want people to know not to do business with people who probably killed and ate their own owner.

yuyuyang1997

Jan 14, 2025

what happened?

CHNtentes

Jan 14, 2025

Did someone leave the hospital door open?

supercharge19

Jan 14, 2025

@yuyuyang1997
I just wanted to try the model through the API. The small models are not good enough, so I wanted to try a large one. The Alibaba website said I would not be charged if I went ahead and entered my credit or debit card information, but as soon as I set it up I was charged 1 dollar. If you are going to do that without notifying people, then it does sound like theft. Anyway, I believe there should be a warning before someone is charged. For example, here is what Oracle Cloud is doing:

They are clear about it. Anyway, I feel I got angrier than I should have, so I will remove this thread after you have read it (let me know). But please also update the information on the Alibaba website (or tell them they should, if you can; otherwise confusion will arise).

CHNtentes

Jan 14, 2025

Is that really a huge problem? I got charged when activating Google Maps API.

yuyuyang1997

Jan 14, 2025

@supercharge19 I see. They should definitely clarify this on their website. Fortunately, this will not result in an actual deduction.

ZiggyS

Jan 14, 2025

A small startup charge sounds normal in the credit card world, as a 'test holding fee'. Normally it gets refunded later.

supercharge19

Jan 17, 2025

Quoting ZiggyS: "A small startup charge sounds normal in the credit card world, as a 'test holding fee'. Normally it gets refunded later."

It should have been mentioned, however much it is.

Daemontatox

Jan 26, 2025

They are not thieves; it's not their fault you don't understand how credit card verification works.
It's not mentioned on most sites because it has become customary for service providers to deduct 1 USD to verify that the card is not a randomly generated number that happened to pass the card number check.
It's then refunded to you within 3-5 working days.
I am 100% sure that a huge company like Alibaba, which has multiple data centers and GPUs for training LLMs, wouldn't be interested in stealing your $1.
So I advise you to get educated first before making accusations.

supercharge19

Jan 26, 2025

Closing now

supercharge19 changed discussion status to closed Jan 26, 2025
