tperes committed on
Commit 8973a6d · verified · 1 Parent(s): 92eb34c

Update README.md

Files changed (1):
  1. README.md +44 -0
README.md CHANGED
@@ -50,6 +50,50 @@ Beyond mathematics, Palmyra-mini-thinking-b demonstrates strong performance in t
  | HMMT23 (extractive_match) | 0.2333 |
  | Average | 0.359378 |

+ ### Use with transformers
+
+ You can run conversational inference using the Transformers Auto classes with the `generate()` function. Here's an example:
+
+ ```py
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_id = "Writer/palmyra-mini-thinking-a"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # Load the model in float16 and let Accelerate place it on available devices.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.float16,
+     device_map="auto",
+     attn_implementation="flash_attention_2",
+ )
+
+ messages = [
+     {
+         "role": "user",
+         "content": "You have a 3-liter jug and a 5-liter jug. How can you measure exactly 4 liters of water?"
+     }
+ ]
+
+ # Render the conversation with the model's chat template and tokenize it.
+ input_ids = tokenizer.apply_chat_template(
+     messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # do_sample=True is required for temperature/top_p to take effect.
+ gen_conf = {
+     "max_new_tokens": 256,
+     "eos_token_id": tokenizer.eos_token_id,
+     "do_sample": True,
+     "temperature": 0.3,
+     "top_p": 0.9,
+ }
+
+ with torch.inference_mode():
+     output_id = model.generate(input_ids, **gen_conf)
+
+ # Decode only the newly generated tokens, skipping the prompt.
+ output_text = tokenizer.decode(output_id[0][input_ids.shape[1]:])
+
+ print(output_text)
+ ```
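
For intuition, `apply_chat_template` renders the message list into the model's prompt string before tokenization; the exact format comes from the tokenizer's bundled template. A minimal sketch, assuming a ChatML-style template for illustration (the hypothetical `render_chatml` helper is not part of Transformers, and the real template may differ):

```python
# Illustrative only: mimics how a ChatML-style chat template might render
# messages. The model's actual template lives in its tokenizer config.
def render_chatml(messages, add_generation_prompt=True):
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so generate() continues from this point.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [{"role": "user", "content": "Hello"}]
print(render_chatml(messages))
```

Passing `add_generation_prompt=True`, as in the example above, appends the opening of an assistant turn, which is why `generate()` produces the assistant's reply rather than continuing the user's message.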

  ## Ethical Considerations