---
library_name: transformers
license: llama3
language:
- en
- fa
tags:
- LLM
- llama-3
- PishroBPMS
- conversational
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
---

# Model Details

The Pishro models are a family of decoder-only models fine-tuned on ProcessMaker data, developed by [PishroBPMS](https://pishrobpms.com/). As an initial release, an 8B instruct model from this family is being made available. Pishro-Llama3-8B-Instruct is built on the [Meta Llama 3 Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.

## How to use

You can run conversational inference using the Transformers Auto classes with the `generate()` function. Let's look at an example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Local path or Hub repo id of this model
model_path = "path/to/Pishro-Llama3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # System (Persian): "You are a ProcessMaker 4 and PHP expert and must produce only a standard PHP script."
    {"role": "system", "content": "تو یک کارشناس ProcessMaker 4 و PHP هستی و باید فقط یک اسکریپت PHP استاندارد تولید کنی."},
    # User (Persian): "Write a simple PHP script that adds two numbers in ProcessMaker 4."
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

# Build the Llama 3 chat prompt and move it to the model's device
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the regular EOS token or Llama 3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
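If you prefer a higher-level entry point, recent Transformers releases also accept chat-formatted messages directly in the `text-generation` pipeline. The sketch below assumes such a release; the model path is a placeholder for the local path or Hub id of this model, and the sampling parameters mirror the example above.

```python
import torch
from transformers import pipeline

# Placeholder: point this at the local path or Hub id of the model
pipe = pipeline(
    "text-generation",
    model="path/to/Pishro-Llama3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # User (Persian): "Write a simple PHP script that adds two numbers in ProcessMaker 4."
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

result = pipe(
    messages,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# With chat input, generated_text holds the full conversation;
# the last entry is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```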