---
library_name: transformers
license: llama3
language:
- en
- fa
tags:
- LLM
- llama-3
- PishroBPMS
- conversational
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
---

# Model Details

The Pishro models are a family of decoder-only models fine-tuned on ProcessMaker data, developed by [PishroBPMS](https://pishrobpms.com/). As an initial release, an 8B instruct model from this family is being made available. Pishro-Llama3-8B-Instruct is built on the [Meta Llama 3 Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.

## How to use

You can run conversational inference using the Transformers Auto classes with the `generate()` function. Let's look at an example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Local path or Hub repo id of this model
model_path = "path/to/Pishro-Llama3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # System (Persian): "You are a ProcessMaker 4 and PHP expert and must produce only a standard PHP script."
    {"role": "system", "content": "تو یک کارشناس ProcessMaker 4 و PHP هستی و باید فقط یک اسکریپت PHP استاندارد تولید کنی."},
    # User (Persian): "Write a simple PHP script that adds two numbers in ProcessMaker 4."
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

# Build the Llama 3 chat prompt and move it to the model's device
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the regular EOS token or Llama 3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
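If you prefer a higher-level entry point, recent Transformers releases also accept chat-formatted messages directly in the `text-generation` pipeline. The sketch below assumes such a release; the model path is a placeholder for the local path or Hub id of this model, and the sampling parameters mirror the example above.

```python
import torch
from transformers import pipeline

# Placeholder: point this at the local path or Hub id of the model
pipe = pipeline(
    "text-generation",
    model="path/to/Pishro-Llama3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # User (Persian): "Write a simple PHP script that adds two numbers in ProcessMaker 4."
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

result = pipe(
    messages,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# With chat input, generated_text holds the full conversation;
# the last entry is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```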