Update README.md
README.md CHANGED
@@ -43,59 +43,17 @@ language:
## 🚀 Quick Start
### Installation
With `transformers<4.51.0`, you will encounter the following error:
```
KeyError: 'qwen3'
```
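Upgrading `transformers` to 4.51.0 or later resolves this, for example:

```
pip install "transformers>=4.51.0"
```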
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```
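If you want a plain response without the reasoning trace, the same chat-template call accepts `enable_thinking=False`, as the inline comment above notes; a minimal variation:

```python
# Render the prompt in non-thinking mode (no reasoning trace in the output)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
```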
For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint:
## 🚀 Quick Start
### Installation
I use vLLM:
```
# Install vLLM from pip:
pip install vllm
```
Now let's download and run the model:
```
# Load and run the model:
vllm serve "Ali-Yaser/Qwen3-R1-8B"
```
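`vllm serve` starts an OpenAI-compatible server, by default on `http://localhost:8000`, so the model can be queried with any OpenAI-style client; a minimal sketch with `curl`:

```
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Ali-Yaser/Qwen3-R1-8B",
        "messages": [{"role": "user", "content": "Give me a short introduction to large language models."}]
      }'
```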
For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint:
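A sketch of both options, with the reasoning-parser flags borrowed from the upstream Qwen3 README (their applicability to this fine-tune is an assumption):

```
# vLLM (>=0.8.5)
vllm serve Ali-Yaser/Qwen3-R1-8B --enable-reasoning --reasoning-parser deepseek_r1

# SGLang (>=0.4.6.post1)
python -m sglang.launch_server --model-path Ali-Yaser/Qwen3-R1-8B --reasoning-parser qwen3
```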