tperes committed on
Commit 8973a6d · verified · 1 Parent(s): 92eb34c

Update README.md

Files changed (1):
  1. README.md +44 -0
README.md CHANGED
@@ -50,6 +50,50 @@ Beyond mathematics, Palmyra-mini-thinking-b demonstrates strong performance in t
  | HMMT23 (extractive_match) | 0.2333 |
  | Average | 0.359378 |

+ ### Use with transformers
+
+ You can run conversational inference using the Transformers Auto classes with the `generate()` function. Here's an example:
+
+ ```py
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_id = "Writer/palmyra-mini-thinking-a"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # Load the model in float16 and let Accelerate place it on available devices.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.float16,
+     device_map="auto",
+     attn_implementation="flash_attention_2",
+ )
+
+ messages = [
+     {
+         "role": "user",
+         "content": "You have a 3-liter jug and a 5-liter jug. How can you measure exactly 4 liters of water?"
+     }
+ ]
+
+ # Render the conversation with the model's chat template and tokenize it.
+ input_ids = tokenizer.apply_chat_template(
+     messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # do_sample=True is required for temperature/top_p to take effect.
+ gen_conf = {
+     "max_new_tokens": 256,
+     "eos_token_id": tokenizer.eos_token_id,
+     "do_sample": True,
+     "temperature": 0.3,
+     "top_p": 0.9,
+ }
+
+ with torch.inference_mode():
+     output_id = model.generate(input_ids, **gen_conf)
+
+ # Decode only the newly generated tokens, skipping the prompt.
+ output_text = tokenizer.decode(output_id[0][input_ids.shape[1]:])
+
+ print(output_text)
+ ```
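
For intuition, `apply_chat_template` renders the message list into the model's prompt string before tokenization; the exact format comes from the tokenizer's bundled template. A minimal sketch, assuming a ChatML-style template for illustration (the hypothetical `render_chatml` helper is not part of Transformers, and the real template may differ):

```python
# Illustrative only: mimics how a ChatML-style chat template might render
# messages. The model's actual template lives in its tokenizer config.
def render_chatml(messages, add_generation_prompt=True):
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so generate() continues from this point.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [{"role": "user", "content": "Hello"}]
print(render_chatml(messages))
```

Passing `add_generation_prompt=True`, as in the example above, appends the opening of an assistant turn, which is why `generate()` produces the assistant's reply rather than continuing the user's message.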

  ## Ethical Considerations