Thieves!
On their website they say you will not be charged until you use a service. I wanted to explore what they were offering, so I just created an account on the Alibaba website, and they tried to steal money from me!!! Luckily I had the card blocked before they could take it. Anyway, beware of the thieves and don't do business with them. I don't think anybody does any business with them anyway; they are just exploited to do that.
And no; I DON'T WANT MONEY BACK, as you could not steal anything, I just want people to know not to do business with people who probably killed and ate their own owner.
what happened?
Did someone leave the hospital door open?
@yuyuyang1997
I just wanted to try the model through the API. The small models are not good enough, so I wanted to try a large one. On the Alibaba website I saw that I would not be charged if I went ahead and gave my credit or debit card information, but as soon as I set it up I was charged 1 dollar. If you are going to do that without notifying people, then it does sound like theft. Anyway, I believe there should be a warning before someone is charged. For example, here is what Oracle Cloud is doing:

They are clear about it. Anyway, I feel I got angrier than I should have, so I will remove this thread after you have read it (let me know). But please also update the information on the Alibaba website (or tell them they should, if you can), otherwise confusion will arise.
Is that really a huge problem? I got charged when activating Google Maps API.
@supercharge19 I see. They should definitely clarify this on their website. Fortunately, this will not result in an actual deduction.
A small startup charge sounds normal in the credit card world, as a 'test holding fee'. Normally it gets refunded later.
It should have been mentioned, however much it is.
They are not thieves. It's not their fault you don't understand how credit card verification works.
It's not mentioned on most sites because it has become customary for service providers to deduct 1 USD to verify that the card is real, and not a randomly generated number that happened to pass the card number check.
It's then refunded to you within 3-5 working days.
I am 100% sure that a huge company like Alibaba, which has multiple data centers and GPUs for training LLMs, wouldn't be interested in stealing your $1.
So I advise you to get educated first before making accusations.
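For anyone wondering what the "card number check" above refers to: it is most likely the Luhn checksum that card numbers carry. Below is a minimal sketch, assuming that is the check meant. The function name `luhn_valid` is made up for illustration, and the sample number is a widely published test value, not a real card. It shows why a number that passes the checksum can still be fake, which is the gap a small temporary charge is meant to close.

```python
def luhn_valid(card_number: str) -> bool:
    """Return True if the digits in card_number pass the Luhn checksum."""
    # Keep only digits so spaces or dashes in the input are ignored.
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


# A well-known test number passes the checksum even though no such card exists.
print(luhn_valid("4111 1111 1111 1111"))
# Changing the last digit breaks the checksum.
print(luhn_valid("4111 1111 1111 1112"))
```

The checksum only catches typos and careless guesses. It says nothing about whether an account exists behind the number, so providers still need some contact with the card issuer to confirm that.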
Closing now