PingVortexLM 0.5B
A fine-tuned version of Qwen/Qwen2.5-0.5B, trained on custom English conversational data. The model is not aimed at coding or multilingual use; its focus is general English conversation.
Built by PingVortex Labs.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "pvlabs/PingVortexLM-0.5B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in bfloat16 and let device_map place the weights on the available device.
# Older transformers releases name the dtype argument torch_dtype.
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")

def chat(user_message):
    # Build a single-turn ChatML prompt that ends with an open assistant turn.
    prompt = (
        "<|im_start|>system\nYou are a helpful assistant<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=512,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens, skipping the prompt.
    response = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
    return response

print(chat("Hello"))
The model uses the standard ChatML format:
<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Your message here<|im_end|>
<|im_start|>assistant
It is recommended to always include the system prompt.
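For conversations with more than one turn, the same format can be assembled from a list of messages. The following is a minimal sketch, not part of the model's repository: the helper name `build_chatml_prompt`, the `DEFAULT_SYSTEM_PROMPT` constant, and the message dictionary layout (`role` and `content` keys) are assumptions made for illustration. It prepends the system prompt when the caller has not supplied one, in line with the recommendation above.

```python
from typing import Dict, List

# Assumed default, copied from the system prompt used in the example above.
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant"


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
) -> str:
    """Render messages as ChatML and leave the assistant turn open.

    Hypothetical helper: each message is a dict with "role" and "content".
    """
    turns = list(messages)
    # Always include a system prompt, as the model card recommends.
    if not turns or turns[0]["role"] != "system":
        turns.insert(0, {"role": "system", "content": system_prompt})

    parts = [
        f"<|im_start|>{turn['role']}\n{turn['content']}<|im_end|>\n"
        for turn in turns
    ]
    # The trailing open assistant tag tells the model to generate the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "Tell me a fun fact."},
]
print(build_chatml_prompt(history))
```

The resulting string can be passed to `tokenizer(...)` in place of the hand-written prompt in `chat`. For a single user message it produces the same prompt as that function.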
Made by PingVortex.