---
library_name: transformers
license: llama3
language:
- en
- fa
tags:
- LLM
- llama-3
- PishroBPMS
- conversational
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
---
## Model Details
The Pishro models are a family of decoder-only models developed by PishroBPMS and fine-tuned on ProcessMaker data. As an initial release, an 8B instruct model from this family is being made available: Pishro-Llama3-8B-Instruct, built on top of the Meta Llama 3 Instruct model.
## How to use
You can run conversational inference using the Transformers Auto classes with the `generate()` function. Here is an example:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Local directory containing the model weights, or the model's Hugging Face repo id.
model_path = "path/to/Pishro-Llama3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # System prompt (Persian): "You are a ProcessMaker 4 and PHP expert and must
    # produce only a standard PHP script."
    {"role": "system",
     "content": "تو یک کارشناس ProcessMaker 4 و PHP هستی و باید فقط یک اسکریپت PHP استاندارد تولید کنی."},
    # User prompt (Persian): "Write a simple PHP script that adds two numbers
    # in ProcessMaker 4."
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the tokenizer's EOS token or Llama 3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
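Under the hood, `apply_chat_template` renders the message list into Llama 3's chat prompt format before tokenizing it. A minimal sketch of that rendering (`render_llama3_prompt` is a hypothetical helper for illustration, assuming the standard Llama 3 special tokens; in practice the tokenizer does this for you):

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    # Each turn is wrapped in role headers and terminated with <|eot_id|>,
    # following the Llama 3 Instruct prompt format.
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Add two numbers."},
]
print(render_llama3_prompt(messages))
```

This also explains the `terminators` list above: generation should stop when the model emits `<|eot_id|>`, which marks the end of its turn.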