DisgustingOzil committed
Commit 8ec9a4f · verified · 1 Parent(s): 373b635
Files changed (1):
  1. README.md +38 -0
README.md CHANGED
@@ -38,6 +38,44 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

 ### Direct Use
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, set_seed
+
+ # Make generation reproducible
+ set_seed(42)
+
+ # Load the model with 4-bit NF4 quantization
+ modelpath = "DisgustingOzil/phi-2-riddler"
+ model = AutoModelForCausalLM.from_pretrained(
+     modelpath,
+     device_map="auto",
+     quantization_config=BitsAndBytesConfig(
+         load_in_4bit=True,
+         bnb_4bit_compute_dtype=torch.float16,
+         bnb_4bit_quant_type="nf4",
+     ),
+     torch_dtype=torch.float16,
+ )
+
+ # The fast tokenizer sometimes ignores the added tokens, so use the slow one
+ tokenizer = AutoTokenizer.from_pretrained(modelpath, use_fast=False)
+
+ question = "Why life is so difficult of life?"
+ messages = [
+     {"role": "user", "content": question},
+ ]
+
+ input_tokens = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+ ).to("cuda")
+
+ output_tokens = model.generate(input_tokens, max_new_tokens=200)
+ output = tokenizer.decode(output_tokens[0])
+
+ print(output)

 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
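Note that `model.generate` returns the prompt tokens followed by the newly generated tokens, so decoding `output_tokens[0]` prints the chat-templated prompt as well as the answer. To show only the model's reply, you can slice off the prompt length before decoding (i.e. decode `output_tokens[0][input_tokens.shape[1]:]`). A minimal sketch of that slicing logic, using hypothetical token IDs in plain lists so it runs without a GPU:

```python
# Hypothetical token IDs for illustration only.
prompt_ids = [15496, 11, 995]                # encoded prompt (3 tokens)
output_ids = prompt_ids + [50256, 318, 257]  # generate() output: prompt + continuation

# Keep only the tokens produced after the prompt.
new_ids = output_ids[len(prompt_ids):]
print(new_ids)  # [50256, 318, 257]
```

In the real snippet above, the same idea is `tokenizer.decode(output_tokens[0][input_tokens.shape[1]:], skip_special_tokens=True)`.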