almaghrabima committed on
Commit
0dc9d6a
·
verified ·
1 Parent(s): caa8848

Update README.md

Files changed (1)
  1. README.md +16 -18
README.md CHANGED
@@ -25,16 +25,14 @@ ALLaM-Thinking is an advanced Arabic Large Language Model specifically optimized
  ## Usage Example

  ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
- from vllm import SamplingParams
- import torch

- # Load the model and tokenizer
  tokenizer = AutoTokenizer.from_pretrained("almaghrabima/ALLaM-Thinking")
- model = AutoModelForCausalLM.from_pretrained("almaghrabima/ALLaM-Thinking")

- # For faster inference with vLLM
- from vllm import LLM, SamplingParams
  model = LLM(model="almaghrabima/ALLaM-Thinking")

  # Format the prompt using chat template
@@ -50,25 +48,25 @@ sampling_params = SamplingParams(
  )

  # Generate response
- output = model.fast_generate(
-     [text],
-     sampling_params=sampling_params,
-     lora_request=None,
- )[0].outputs[0].text
-
  print(output)
  ```

- ## Example Answer

  ```
- First, we need to determine the number of players who score goals out of the total number of players.

- 40% of 15 players = 0.40 * 15 = 6 players who score goals.

- Next, if each goal-scoring player scores an average of 5 goals, then the total number of goals scored by the goal-scoring players is:

- 6 players * 5 goals per player = 30 goals.
  ```

  ## Unsloth Optimization
 
  ## Usage Example

  ```python
+ from transformers import AutoTokenizer
+ from vllm import LLM, SamplingParams

+ # Load the tokenizer
  tokenizer = AutoTokenizer.from_pretrained("almaghrabima/ALLaM-Thinking")

+ # Initialize the model using vLLM
+ # Note: You should only initialize the model once, using vLLM directly
  model = LLM(model="almaghrabima/ALLaM-Thinking")

  # Format the prompt using chat template

  )

  # Generate response
+ outputs = model.generate([text], sampling_params)
+ output = outputs[0].outputs[0].text

  print(output)
  ```
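
The updated example calls `model.generate([text], sampling_params)`, but the step that builds `text` with the chat template falls outside the shown hunks. With transformers, that step is usually `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`; here is a minimal plain-Python sketch of the same idea, where the `<|user|>`/`<|assistant|>` role markers and the question string are assumptions, not the model's actual template:

```python
# Hypothetical sketch of chat-template formatting. The real prompt
# should come from tokenizer.apply_chat_template; role markers here
# are assumed for illustration only.
messages = [
    {"role": "user",
     "content": "40% of 15 players score goals, 5 each on average. Total goals?"},
]

def format_chat(messages):
    # Concatenate each turn with a role marker, then append an
    # assistant marker so the model continues as the assistant.
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    parts.append("<|assistant|>\n")
    return "\n".join(parts)

text = format_chat(messages)
print(text)
```

The resulting string is what would be passed to `model.generate` in the example above.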

+ ## Answer

  ```
+ First, let's find the number of players who score goals.
+
+ 40% of 15 players equals:
+
+ 0.40 * 15 = 6 players

+ Now, if each of these six players scores an average of 5 goals during the season, then the total number of goals scored by the goal-scoring players will be:

+ 6 players * 5 goals per player = 30 goals

+ Therefore, the goal-scoring players scored a total of 30 goals during the season.
  ```
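
The arithmetic in the sample answer is easy to check directly:

```python
# Verify the worked example: 40% of 15 players score goals,
# each averaging 5 goals over the season.
total_players = 15
scorers = round(0.40 * total_players)  # 6 players
total_goals = scorers * 5              # 30 goals
print(scorers, total_goals)
```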

  ## Unsloth Optimization