Ali-Yaser committed on
Commit 06cd72a · verified · 1 Parent(s): 1a1d1d2

Update README.md

Files changed (1)
  1. README.md +6 -48
README.md CHANGED
@@ -43,59 +43,17 @@ language:
  ## 🚀 Quick Start

  ### Installation
- The code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`.

- With `transformers<4.51.0`, you will encounter the following error:
  ```
- KeyError: 'qwen3'
  ```

- The following contains a code snippet illustrating how to use the model generate content based on given inputs.
  ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "Qwen/Qwen3-8B"
-
- # load the tokenizer and the model
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
-
- # prepare the model input
- prompt = "Give me a short introduction to large language model."
- messages = [
-     {"role": "user", "content": prompt}
- ]
- text = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True,
-     enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
- )
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
- # conduct text completion
- generated_ids = model.generate(
-     **model_inputs,
-     max_new_tokens=32768
- )
- output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
-
- # parsing thinking content
- try:
-     # rindex finding 151668 (</think>)
-     index = len(output_ids) - output_ids[::-1].index(151668)
- except ValueError:
-     index = 0
-
- thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
- content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
-
- print("thinking content:", thinking_content)
- print("content:", content)
  ```

  For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` or to create an OpenAI-compatible API endpoint:
 
  ## 🚀 Quick Start

  ### Installation
+ I use a vLLM

  ```
+ # Install vLLM from pip:
+ pip install vllm
  ```

+ and lets download the model and run model
  ```python
+ # Load and run the model:
+ vllm serve "Ali-Yaser/Qwen3-R1-8B"
  ```

  For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` or to create an OpenAI-compatible API endpoint:
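
As a rough illustration of what the `vllm serve` command added in this commit makes possible, the sketch below builds a request for the server's OpenAI-compatible chat completions route using only the Python standard library. It assumes the server is listening on vLLM's default address, `http://localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are hypothetical and do not appear in the README.

```python
import json
import urllib.request

# Assumption: vLLM's default host and port; adjust if the server
# was started with different --host/--port values.
BASE_URL = "http://localhost:8000/v1"
# Model name taken from the `vllm serve` line in this commit.
MODEL_NAME = "Ali-Yaser/Qwen3-R1-8B"


def build_chat_request(prompt, model=MODEL_NAME, max_tokens=512):
    """Build a POST request for the chat completions route (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the reply text; requires a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(send_chat_request(build_chat_request("Give me a short introduction to large language model.")))` should print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.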