---
library_name: transformers
license: llama3
language:
- en
- fa
tags:
- LLM
- llama-3
- PishroBPMS
- conversational
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
---
## Model Details
The Pishro models are a family of decoder-only models developed by PishroBPMS and fine-tuned on ProcessMaker data. As an initial release, an 8B instruct model from this family is being made available: Pishro-Llama3-8B-Instruct, built on top of the Meta Llama 3 Instruct model.
## How to use
You can run conversational inference using the Transformers Auto classes with the `generate()` function. Here is an example:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Local directory containing the model weights, or the model's Hugging Face repo id.
model_path = "path/to/Pishro-Llama3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # System prompt (Persian): "You are a ProcessMaker 4 and PHP expert and must
    # produce only a standard PHP script."
    {"role": "system",
     "content": "تو یک کارشناس ProcessMaker 4 و PHP هستی و باید فقط یک اسکریپت PHP استاندارد تولید کنی."},
    # User prompt (Persian): "Write a simple PHP script that adds two numbers
    # in ProcessMaker 4."
    {"role": "user", "content": "یک اسکریپت PHP ساده برای جمع دو عدد در ProcessMaker 4 بنویس."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the tokenizer's EOS token or Llama 3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
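Under the hood, `apply_chat_template` renders the message list into Llama 3's chat prompt format before tokenizing it. A minimal sketch of that rendering (`render_llama3_prompt` is a hypothetical helper for illustration, assuming the standard Llama 3 special tokens; in practice the tokenizer does this for you):

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    # Each turn is wrapped in role headers and terminated with <|eot_id|>,
    # following the Llama 3 Instruct prompt format.
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Add two numbers."},
]
print(render_llama3_prompt(messages))
```

This also explains the `terminators` list above: generation should stop when the model emits `<|eot_id|>`, which marks the end of its turn.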