---
library_name: transformers
license: llama3
language:
- en
- fa
tags:
- LLM
- llama-3
- PishroBPMS
- conversational
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
---
# Model Details
The Pishro models are a family of decoder-only models fine-tuned on ProcessMaker data, developed by [PishroBPMS](https://pishrobpms.com/). As an initial release, an 8B instruct model from this family is being made available.
Pishro-Llama3-8B-Instruct is built on the [Meta Llama 3 Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
## How to use
You can run conversational inference using the Transformers `Auto` classes together with the `generate()` function. Here is an example.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Set this to the model's repo ID on the Hugging Face Hub, or to a local
# checkpoint path.
model_path = "PishroBPMS/Pishro-Llama3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# System prompt: "You are a ProcessMaker 4 and PHP expert and must produce
# only a standard PHP script."
# User prompt: "Write a simple PHP script to add two numbers in ProcessMaker 4."
messages = [
    {"role": "system", "content": "تو یک کارشناس ProcessMaker 4 و PHP هستی و باید فقط یک اسکریپت PHP استاندارد تولید کنی."},
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Llama 3 marks the end of each turn with <|eot_id|>, so stop on it as well
# as on the regular EOS token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Strip the prompt tokens and decode only the newly generated response.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
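For reference, `apply_chat_template` renders the message list into Llama 3's prompt format before tokenization. The sketch below is an illustrative reconstruction of that rendering only; the Jinja template bundled with the tokenizer is the authoritative source, and `render_llama3_prompt` is a hypothetical helper, not part of Transformers.

```python
# Illustrative reconstruction of the Llama 3 chat prompt format that
# tokenizer.apply_chat_template(..., add_generation_prompt=True) produces.
# The tokenizer's bundled chat template is authoritative; this is a sketch.

def render_llama3_prompt(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn marker.
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # add_generation_prompt=True appends an open assistant header so the
    # model continues with the assistant's reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = render_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

This is why `<|eot_id|>` appears in the `terminators` list above: the model is trained to emit it when its turn is finished.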
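For multi-turn chat, the decoded reply is appended to `messages` as an `assistant` turn before the next user message, and the template/generate steps are repeated. A minimal sketch of that bookkeeping (the `reply` string here is a stand-in for real `tokenizer.decode(...)` output):

```python
# Multi-turn continuation: append the assistant's decoded reply and the next
# user question, then re-run apply_chat_template() and generate() as above.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a simple PHP script to add two numbers."},
]

reply = "<?php echo 1 + 2;"  # stand-in for tokenizer.decode(...) output
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Now subtract them instead."})

# The conversation now alternates user/assistant turns after the system prompt.
print([m["role"] for m in messages])
```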