This model is part of our effort to make AI in Spanish accessible to everyone, not just those with a high-end phone or a powerful computer.
It was built using gpt-neo-125m by EleutherAI as the base model, with the conversational dataset from the OpenAssistant project (oasst1) used to adapt it to the instruction format.
Text generation.
Use the code below to get started with the model.
from random import choice
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Use a GPU when one is available, otherwise fall back to CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model_name = "Aconoya/Nono_instruct_neo-125m_dpo"
model = AutoModelForCausalLM.from_pretrained(model_name)
model = model.to(device)
model.eval()  # inference only; gradient checkpointing is not needed here
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
# Tags that delimit each turn in the prompt format.
ender_string = '<endofturn>'
system_string = '<system>'
user_string = '<user>'
assistant_string = '<assistant>'
# Pick one sample prompt at random.
prompts = ['Hello! How are you?', '¡Hola!, ¿Cómo estás?', '¿Qué es un perro?', 'What is a dog?']
prompt = choice(prompts)
# Layout: system turn, user turn, then an open assistant tag for the model to complete.
formatted_prompt = system_string + 'You are a digital assistant.' + ender_string + '\n' + user_string + prompt + ender_string + '\n' + assistant_string
model_input = tokenizer.encode(formatted_prompt, return_tensors='pt').to(device)
# Generate at most 50 new tokens after the prompt.
generated_ids = model.generate(input_ids=model_input, pad_token_id=tokenizer.eos_token_id, max_new_tokens=50)
# Decode only the newly generated tokens, skipping the prompt itself.
generated_text = tokenizer.decode(generated_ids[:, model_input.shape[-1]:][0], skip_special_tokens=True)
print('Prompt:', prompt)
print("Response: '{}'".format(generated_text))
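The prompt layout used in the snippet above can be wrapped in a small helper. The sketch below is not part of the model's API: `build_prompt` and `trim_reply` are hypothetical names, and the multi-turn layout (earlier user/assistant turns each closed with `<endofturn>`) is an assumption extrapolated from the single-turn format shown above. `trim_reply` also assumes the model may emit `<endofturn>` as plain text at the end of its answer, which the card does not confirm.

```python
from typing import List, Tuple

# Turn delimiters, copied from the quick-start snippet.
ENDER = '<endofturn>'
SYSTEM = '<system>'
USER = '<user>'
ASSISTANT = '<assistant>'


def build_prompt(system: str, history: List[Tuple[str, str]], user_message: str) -> str:
    """Build a prompt string (hypothetical helper).

    `history` holds earlier (user, assistant) pairs. Carrying earlier turns
    this way is an assumption, not a documented format.
    """
    parts = [SYSTEM + system + ENDER]
    for past_user, past_assistant in history:
        parts.append(USER + past_user + ENDER)
        parts.append(ASSISTANT + past_assistant + ENDER)
    parts.append(USER + user_message + ENDER)
    # End with an open assistant tag so the model writes the reply.
    return '\n'.join(parts) + '\n' + ASSISTANT


def trim_reply(generated_text: str) -> str:
    """Keep only the text before the first end-of-turn tag, if one is present."""
    return generated_text.split(ENDER, 1)[0].strip()


single_turn = build_prompt('You are a digital assistant.', [], 'What is a dog?')
print(single_turn)
```

With an empty `history`, `build_prompt` produces the same string as `formatted_prompt` in the snippet above, so its result can be passed to `tokenizer.encode` unchanged.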
The model was trained using the conversational dataset from the OpenAssistant project (oasst1).
The model was trained using Kaggle.com free services.
Base model
EleutherAI/gpt-neo-125m