license: llama2
datasets:
  - itsrocchi/seeweb-it-292-forLLM
language:
  - it

Model Card for itsrocchi/SeewebLLM-it-ver2

This model is a fine-tuned version of Llama-2-7b-chat-hf, specialized for Italian.

Backbone Model: Llama 2
Language(s): Italian
Finetuned from model: Llama-2-7b-chat-hf

Bias, Risks, and Limitations

Because training was limited, the model may produce incorrect or ungrammatical output.

Training script

The following repository contains the scripts and instructions used for fine-tuning and testing:

https://github.com/itsrocchi/finetuning-llama2-ita.git

Inference

Here is a short Python snippet to perform inference:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("itsrocchi/SeewebLLM-it-ver2")
model = AutoModelForCausalLM.from_pretrained(
    "itsrocchi/SeewebLLM-it-ver2",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # 8-bit loading requires the bitsandbytes package
    rope_scaling={"type": "dynamic", "factor": 2.0},
)

# Optionally, replace the model and tokenizer arguments with the
# absolute path of a local directory containing the model.

prompt = "### User:\nDescrivi cos'è l'intelligenza artificiale\n\n### Assistant:\n"
# Edit the text between "### User:" and "### Assistant:" to customize the prompt.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

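# max_new_tokens must be a finite integer; 512 is an arbitrary cap, adjust as needed.
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=512)
# output[0] is the full sequence, so output_text contains the prompt followed by the answer.
output_text = tokenizer.decode(output[0], skip_special_tokens=True)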
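
The snippet above hard-codes the prompt and decodes the full sequence, prompt included. As a minimal sketch, the two helpers below build a prompt in the same "### User: / ### Assistant:" template and strip the prompt from the decoded text. The names `build_prompt` and `extract_answer` are hypothetical and not part of the model repository, and the sketch assumes the decoded text still contains the "### Assistant:" tag verbatim.

```python
USER_TAG = "### User:"
ASSISTANT_TAG = "### Assistant:"


def build_prompt(user_message: str) -> str:
    """Wrap a user message in the template used by the inference snippet."""
    return f"{USER_TAG}\n{user_message.strip()}\n\n{ASSISTANT_TAG}\n"


def extract_answer(decoded_text: str) -> str:
    """Return the text after the first assistant tag.

    If the tag is missing (an assumption violated), fall back to
    returning the whole text, stripped of surrounding whitespace.
    """
    _before, sep, answer = decoded_text.partition(ASSISTANT_TAG)
    if not sep:
        return decoded_text.strip()
    return answer.strip()
```

With these in place, `prompt = build_prompt("Descrivi cos'è l'intelligenza artificiale")` would replace the hard-coded string, and `extract_answer(output_text)` would return only the generated answer.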

Training Data and Details

The training dataset is itsrocchi/seeweb-it-292-forLLM, which contains approximately 300 Italian prompt-answer conversations.

Training was performed on an RTX A6000 on Seeweb's Cloud Server GPU.