g-ronimo committed (verified)
Commit ee382f2 · Parent(s): 3132899

Update README.md

Files changed (1):
  1. README.md +29 -31

README.md CHANGED
@@ -13,36 +13,6 @@ datasets:
  * max. seq. length: 1024 tokens
  * code in code/
 
- ## Inference
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- import torch
-
- modelpath="g-ronimo/phi-2-OpenHermes-2.5"
-
- model = AutoModelForCausalLM.from_pretrained(
-     modelpath,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
-     # attn_implementation="flash_attention_2",
- )
- tokenizer = AutoTokenizer.from_pretrained(modelpath)
-
- messages = [
-     {"role": "user", "content": "what does it mean to be successful?"},
- ]
-
- input_tokens = tokenizer.apply_chat_template(
-     messages,
-     add_generation_prompt=True,
-     return_tensors="pt"
- ).to("cuda")
- output_tokens = model.generate(input_tokens, max_new_tokens=500)
- output = tokenizer.decode(output_tokens[0])
-
- print(output)
- ```
-
  ## Evals
 
  | Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
@@ -125,4 +95,32 @@ Average: 35.9%
 
  Average score: 45.3%
 
- Elapsed time: 01:24:18
+ ## Inference
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ modelpath="g-ronimo/phi-2-OpenHermes-2.5"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     modelpath,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     # attn_implementation="flash_attention_2",
+ )
+ tokenizer = AutoTokenizer.from_pretrained(modelpath)
+
+ messages = [
+     {"role": "user", "content": "what does it mean to be successful?"},
+ ]
+
+ input_tokens = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to("cuda")
+ output_tokens = model.generate(input_tokens, max_new_tokens=500)
+ output = tokenizer.decode(output_tokens[0])
+
+ print(output)
+ ```
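
One note on the moved snippet: `tokenizer.decode(output_tokens[0])` decodes the prompt together with the completion, so the question is echoed before the answer. A minimal sketch of decoding only the newly generated tokens, assuming the same `input_tokens`, `output_tokens`, and `tokenizer` objects as in the README example:

```python
# Hypothetical continuation of the README snippet above:
# slice off the prompt tokens so only the model's reply is decoded.
prompt_length = input_tokens.shape[-1]  # number of prompt tokens
reply = tokenizer.decode(
    output_tokens[0][prompt_length:],   # keep only generated tokens
    skip_special_tokens=True,           # drop EOS / padding markers
)
print(reply)
```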