Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

phi-2-super - bnb 8bits
- Model creator: https://huggingface.co/abacaj/
- Original model: https://huggingface.co/abacaj/phi-2-super/
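Since these are bnb 8-bit weights, they can be loaded through transformers with a bitsandbytes quantization config. A minimal sketch, assuming a CUDA-capable GPU with the `bitsandbytes` package installed (the heavy download is kept inside a function so nothing runs on import; the original model id is used as a stand-in here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 8-bit loading config, matching the "bnb 8bits" quantization of this repo
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

def load_8bit(model_name: str = "abacaj/phi-2-super"):
    """Download and load the model in 8-bit. Requires a CUDA GPU and bitsandbytes."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto",  # place the quantized weights on the available GPU(s)
    )
    return tokenizer, model

# tokenizer, model = load_8bit()  # uncomment to actually download and load
```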

Original model description:
---
license: mit
license_link: https://huggingface.co/microsoft/phi-2/resolve/main/LICENSE
language:
- en
widget:
- text: Hello who are you?
  example_title: Identity
- text: What can you do?
  example_title: Capabilities
- text: Create a fastapi endpoint to retrieve the weather given a zip code.
  example_title: Coding
tags:
- convAI
- conversational
pipeline_tag: text-generation
model-index:
- name: phi-2-super
  results:
  # IFEval
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Instruction Following Eval
      type: wis-k/instruction-following-eval
    metrics:
    - type: acc
      name: prompt_level_loose_acc
      value: 0.2717
    source:
      name: LightEval
      url: https://github.com/huggingface/lighteval
---

# Phi-2-super (SFT + cDPO)

Base Model: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ceeb27e7f6014c0e9d9268/5-LQCMrXi8FN_ewcWL47v.png)

# How to run inference:

```python
import transformers
import torch

if __name__ == "__main__":
    model_name = "abacaj/phi-2-super"
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

    model = (
        transformers.AutoModelForCausalLM.from_pretrained(
            model_name,
        )
        .to("cuda:0")
        .eval()
    )

    messages = [
        {"role": "user", "content": "Hello, who are you?"}
    ]
    inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
    input_ids_cutoff = inputs.size(dim=1)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs,
            use_cache=True,
            max_new_tokens=512,
            temperature=0.2,
            top_p=0.95,
            do_sample=True,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    completion = tokenizer.decode(
        generated_ids[0][input_ids_cutoff:],
        skip_special_tokens=True,
    )

    print(completion)
```
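The sampling arguments passed to `generate` above can also be collected into a reusable `GenerationConfig`. A small sketch; the parameter values mirror the snippet above:

```python
from transformers import GenerationConfig

# Same decoding settings as the inference example above
gen_config = GenerationConfig(
    max_new_tokens=512,
    temperature=0.2,
    top_p=0.95,
    do_sample=True,
)

# Usage: model.generate(input_ids=inputs, generation_config=gen_config,
#                       eos_token_id=tokenizer.eos_token_id)
```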

# Chat template

The model uses the same chat template as found in Mistral instruct models:

```python
text = "<|endoftext|>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!<|endoftext|> "
"[INST] Do you have mayonnaise recipes? [/INST]"
```
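To make that format concrete, here is a small, hypothetical helper that renders a message list into this template by hand (plain string formatting; `render_chat` is not part of transformers — in practice you would just call `tokenizer.apply_chat_template`):

```python
EOS = "<|endoftext|>"  # phi-2 uses <|endoftext|> as both BOS and EOS

def render_chat(messages: list[dict]) -> str:
    """Render user/assistant turns into the Mistral-style [INST] template."""
    out = EOS  # the conversation starts with the BOS token
    for msg in messages:
        if msg["role"] == "user":
            out += f"[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            out += f"{msg['content']}{EOS} "
    return out

text = render_chat([
    {"role": "user", "content": "What is your favourite condiment?"},
])
print(text)  # <|endoftext|>[INST] What is your favourite condiment? [/INST]
```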

You don't need to do this manually if you use the HF transformers tokenizer:

```python
messages = [
    {"role": "user", "content": "Hello, who are you?"},
    {"role": "assistant", "content": "I am ..."}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
```

# MT-bench / heval

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ceeb27e7f6014c0e9d9268/lnFu3x1ufdpQVysIrX4-G.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ceeb27e7f6014c0e9d9268/mJfBpH8dIW7Ii2KAGI_A7.png)