DisgustingOzil/phi-2-riddler

DisgustingOzil committed on Mar 21, 2024

Commit 8ec9a4f · verified · 1 Parent(s): 373b635

Updated

Files changed (1)

README.md +38 -0

README.md CHANGED

@@ -38,6 +38,44 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
+import torch
+from transformers import AutoModelForCausalLM, BitsAndBytesConfig, set_seed
+# set seed
+set_seed(42)
+# Load model
+modelpath = "DisgustingOzil/phi-2-riddler"
+model = AutoModelForCausalLM.from_pretrained(
+    modelpath,
+    device_map="auto",
+    quantization_config=BitsAndBytesConfig(
+        load_in_4bit=True,
+        bnb_4bit_compute_dtype=torch.float16,
+        bnb_4bit_quant_type="nf4",
+    ),
+    torch_dtype=torch.float16,
+)
+from transformers import AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained(modelpath, use_fast=False)
+question = "Why is life so difficult?"
+messages = [
+    {"role": "user", "content": question},
+]
+input_tokens = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_tensors="pt"
+).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"
+output_tokens = model.generate(input_tokens, max_new_tokens=200)
+output = tokenizer.decode(output_tokens[0])
+print(output)
+# Note: use_fast=False is set above because the fast tokenizer sometimes ignores the added tokens
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->