```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mpasila/sunfall-SimPO-9B")
model = AutoModelForCausalLM.from_pretrained("mpasila/sunfall-SimPO-9B")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
Got this crazy idea: what if you put the LoRA from crestf411/sunfall-peft on top of princeton-nlp/gemma-2-9b-it-SimPO? This model exists solely for that purpose.

Also, I just realized the script used fp16 and not bf16, so I'll reupload the model. Crisis averted.
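For reference, here's a minimal sketch of how a merge like that can be done with `peft`. This is an assumption of what the script does, not the actual script; it presumes the crestf411/sunfall-peft adapter loads directly on top of the princeton-nlp/gemma-2-9b-it-SimPO checkpoint, and it uses bf16 per the note above:

```python
# Minimal sketch (not the actual merge script): bake the sunfall LoRA
# into the SimPO base model. Assumes the adapter repo applies cleanly
# on top of princeton-nlp/gemma-2-9b-it-SimPO.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "princeton-nlp/gemma-2-9b-it-SimPO",
    torch_dtype=torch.bfloat16,  # bf16, not fp16
)
model = PeftModel.from_pretrained(base, "crestf411/sunfall-peft")
model = model.merge_and_unload()  # fold the LoRA weights into the base

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/gemma-2-9b-it-SimPO")
model.save_pretrained("sunfall-SimPO-9B")
tokenizer.save_pretrained("sunfall-SimPO-9B")
```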
The prompt format should be the Gemma 2 Instruct template.
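Since the tokenizer ships with the chat template, you shouldn't need to build the prompt by hand, but you can inspect what it renders to; the commented output below is roughly what the Gemma 2 Instruct format looks like:

```python
# Render the chat template without tokenizing to see the raw prompt
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Who are you?"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
# Expected (approximately):
# <bos><start_of_turn>user
# Who are you?<end_of_turn>
# <start_of_turn>model
```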
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="mpasila/sunfall-SimPO-9B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```