Instructions to use WeiboAI/VibeThinker-3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use WeiboAI/VibeThinker-3B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="WeiboAI/VibeThinker-3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("WeiboAI/VibeThinker-3B")
model = AutoModelForCausalLM.from_pretrained("WeiboAI/VibeThinker-3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Inference
- Local Apps
- vLLM
How to use WeiboAI/VibeThinker-3B with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "WeiboAI/VibeThinker-3B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "WeiboAI/VibeThinker-3B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker
```bash
docker model run hf.co/WeiboAI/VibeThinker-3B
```
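The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000` and returns the usual OpenAI-style response shape (`choices[0].message.content`). The helper names `build_chat_request` and `ask` are made up for this example and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="WeiboAI/VibeThinker-3B"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body; requires a running server.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (only works while the server is running):
# print(ask("What is the capital of France?"))
```

For the SGLang server, which is launched on port 30000 in these instructions, the same sketch should work with `base_url="http://localhost:30000"`.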
- SGLang
How to use WeiboAI/VibeThinker-3B with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "WeiboAI/VibeThinker-3B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "WeiboAI/VibeThinker-3B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker images
```bash
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "WeiboAI/VibeThinker-3B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "WeiboAI/VibeThinker-3B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
- Docker Model Runner
How to use WeiboAI/VibeThinker-3B with Docker Model Runner:
```bash
docker model run hf.co/WeiboAI/VibeThinker-3B
```
It's a very strong model for what it was trained on! Bravo!
I gave it quite a few hard problems, within the domain, focused on programming. It performs exceptionally well IMO. Here are a few noteworthy prompts:
https://gist.github.com/codingquark/8df7f68eaafdabbae3498f7e083a0861
https://gist.github.com/codingquark/beaf63593c96ff52bf289d44167433b2
https://gist.github.com/codingquark/091e0b04bee6ccc9f53e80d9e0587c03
Just to give an idea quickly in case you don't want to go through the gists, these are the prompts:
- Place 4 rooks on a 6×6 chessboard so that no two share a row or column, and no rook sits on a cell (i,i) of the main diagonal (rows and columns numbered 1–6). How many such placements are there? Give the final answer as \boxed{N}
- Write a self-contained Python function count_dominant(a: list[int]) -> int returning the number of contiguous subarrays that are dominant: a subarray is dominant if its maximum element is strictly greater than the sum of all its other elements. For a length-1 subarray there are no other elements, so the "sum of others" is 0 (so it's dominant iff its single element is > 0). The array may contain negatives and duplicates. Aim for better than the brute-force O(n^2) and state the complexity you reach. Explain your approach, then hand-trace on a=[3,-1,2,-5,4], a=[5,5,1], and a=[-2,-3,-1], giving the return value for each.
- Let N be the number of functions f:{1,…,8}→{1,…,8} such that f(f(x))=f(x) for every x, and such that there are exactly 3 values x with f(x)=x. Find N and give it as \boxed{N}.
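For anyone who wants to check the model's answers without opening the gists, here is a rough brute-force sketch of reference checkers for the three prompts. These are my own illustrations, not taken from the gists or from the model's output. The function names are made up. The second one is the plain O(n^2) baseline, not the faster solution the prompt asks for. The third assumes the fixed-point condition in the last prompt reads f(x)=x.

```python
from itertools import combinations, permutations, product


def count_rook_placements(n: int = 6, k: int = 4) -> int:
    """Prompt 1: k non-attacking rooks on an n x n board, none on the main diagonal."""
    total = 0
    for rows in combinations(range(n), k):       # which rows hold a rook
        for cols in permutations(range(n), k):   # a distinct column for each chosen row
            if all(r != c for r, c in zip(rows, cols)):
                total += 1
    return total


def count_dominant_bruteforce(a: list[int]) -> int:
    """Prompt 2: O(n^2) reference count of dominant contiguous subarrays."""
    count = 0
    for i in range(len(a)):
        running_max = a[i]
        running_sum = 0
        for j in range(i, len(a)):
            running_max = max(running_max, a[j])
            running_sum += a[j]
            # One copy of the maximum versus the sum of everything else
            if running_max > running_sum - running_max:
                count += 1
    return count


def count_idempotent(n: int, fixed: int) -> int:
    """Prompt 3 (assumed reading): f(f(x)) = f(x) for all x, exactly `fixed` fixed points."""
    total = 0
    for f in product(range(n), repeat=n):        # n**n candidates, so keep n small
        is_idempotent = all(f[f[x]] == f[x] for x in range(n))
        if is_idempotent and sum(f[x] == x for x in range(n)) == fixed:
            total += 1
    return total
```

Under these assumptions, `count_rook_placements()` returns 2715, and `count_dominant_bruteforce` returns 13, 4, and 3 for `[3,-1,2,-5,4]`, `[5,5,1]`, and `[-2,-3,-1]`. `count_idempotent(8, 3)` walks through 8^8 candidates, so it is slow in pure Python. Small cases such as `count_idempotent(4, 2)` are a quicker way to sanity-check a closed form.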
Thanks a lot for your positive feedback and the fantastic, professional test cases! By the way, before launching this version we ran a large number of out-of-distribution (OOD) tests within the usage scope described in the technical report, so we are confident in its generalization on problems like these.