Update README.md

README.md
CHANGED

@@ -50,6 +50,52 @@ The chat template was updated accordingly to support multi-turn conversation for
{% if add_generation_prompt %}{{ '<|assistant|>' }}{% endif %}
```

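To see what the template actually emits, you can render a conversation to a plain prompt string before tokenizing. The sketch below is a minimal illustration, not part of the original example; it assumes the tokenizer published in this repo, and the exact output depends on the full template shipped with it:

```python
from transformers import AutoTokenizer

# Load the tokenizer that carries the custom chat_template.
tokenizer = AutoTokenizer.from_pretrained(
    "madcows/siwon-mini-instruct-0626",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help you?"},
    {"role": "user", "content": "Summarize this README for me."},
]

# tokenize=False returns the rendered prompt string instead of token IDs.
# With add_generation_prompt=True the string should end with the
# '<|assistant|>' tag from the template snippet above.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```
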
## 🧪 Inference with Transformers

Below is an example of how to load and use the model with the adjusted tokenizer, special token IDs, and custom chat template.

> **Note**: This model uses a custom `chat_template` and updated special token IDs:
> - `<|end|>` → 200020 (EOS)
> - `<|dummy_85|>` → 200029 (PAD)
> - `�` → 200030 (UNK)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "madcows/siwon-mini-instruct-0626"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "안녕하세요."},  # "Hello." in Korean
]

# return_dict=True yields input_ids/attention_mask so the result can be
# unpacked into generate(); add_generation_prompt=True appends the
# <|assistant|> tag defined in the chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=2048,
    # do_sample=True,          # Optional
    # top_p=0.95,              # Optional
    # temperature=0.6,         # Optional
    # repetition_penalty=1.1,  # Optional
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```

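The special token IDs listed in the note above can be sanity-checked against the loaded tokenizer. This is an optional snippet, not part of the original example; it only assumes the ID values quoted in the note:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "madcows/siwon-mini-instruct-0626",
    trust_remote_code=True,
)

# Each lookup should return the ID quoted in the note above.
print(tokenizer.convert_tokens_to_ids("<|end|>"))       # 200020 (EOS)
print(tokenizer.convert_tokens_to_ids("<|dummy_85|>"))  # 200029 (PAD)

# The tokenizer's registered special-token IDs should match as well.
print(tokenizer.eos_token_id)  # 200020
print(tokenizer.pad_token_id)  # 200029
print(tokenizer.unk_token_id)  # 200030
```
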
## 📌 Caution
* Commercial use is strictly prohibited.