---
license: gpl-3.0
tags:
- conversational
- gpt2
language:
- es
datasets:
- open_subtitles
widget:
- text: Me gusta el deporte
  example_title: Interacción
- text: Hola
  example_title: Saludo
- text: ¿Como estas?
  example_title: Pregunta
---

# Spanish GPT-2 as backbone

A conversational model for Spanish, fine-tuned on the [OpenSubtitles](https://opus.nlpl.eu/OpenSubtitles-v2018.php) dataset. The backbone is a GPT-2 model that, according to the [Flax/JAX community](https://huggingface.co/flax-community/gpt-2-spanish) on Hugging Face, was trained from scratch on the Spanish portion of the OSCAR dataset.

## Model description and fine-tuning

The backbone follows OpenAI's GPT-2, introduced in the paper "Language Models are Unsupervised Multitask Learners" by Alec Radford et al. It was then fine-tuned on a large Spanish dataset, a transfer-learning step that adapts the text-generation model to conversational tasks. Special tokens play a key role in this fine-tuning: they mark the start and end of each string and the point where the bot's reply begins.

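```python
# `tokenizer` is assumed to be the backbone's tokenizer, loaded beforehand.
# Register the padding and sequence-boundary tokens as special tokens.
tokenizer.add_special_tokens({"pad_token": "<pad>",
                              "bos_token": "<startofstring>",
                              "eos_token": "<endofstring>"})
# Add the marker that introduces the bot's reply as a regular (non-special) token.
tokenizer.add_tokens(["<bot>:"])
```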
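
As a rough illustration of how these tokens might frame a training pair, the sketch below builds one string per exchange. The layout is an assumption inferred from the inference prompt used in this card (`<startofstring> input <bot>: `), and `format_example` is a hypothetical helper, not part of the original training code.

```python
# Hypothetical helper: the exact training format is an assumption,
# inferred from the inference prompt "<startofstring> {input} <bot>: ".
BOS = "<startofstring>"
EOS = "<endofstring>"
BOT = "<bot>:"


def format_example(user_text: str, bot_text: str) -> str:
    """Wrap one user/bot exchange in the boundary and reply-marker tokens."""
    return f"{BOS} {user_text.strip()} {BOT} {bot_text.strip()} {EOS}"


if __name__ == "__main__":
    print(format_example("Hola", "Hola, ¿cómo estás?"))
```

When tokens are added to a tokenizer, the model's embedding matrix generally has to be resized to match, for example with `model.resize_token_embeddings(len(tokenizer))`.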

## How to use

You can load this model with `AutoTokenizer` and `AutoModelForCausalLM` and chat with it directly:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("erikycd/chatbot_hadita")
model = AutoModelForCausalLM.from_pretrained("erikycd/chatbot_hadita")

# Pick the best available device: CUDA GPU, Apple MPS, or CPU.
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(device)

def infer(inp):
    # Wrap the user input in the prompt format the model expects.
    inp = "<startofstring> " + inp + " <bot>: "
    inp = tokenizer(inp, return_tensors="pt")
    X = inp["input_ids"].to(device)
    attn = inp["attention_mask"].to(device)
    output = model.generate(X, attention_mask=attn, pad_token_id=tokenizer.eos_token_id)
    # Decode the generated sequence, dropping special tokens.
    output = tokenizer.decode(output[0], skip_special_tokens=True)
    return output

# Chat until the user types an exit command.
exit_commands = ("bye", "quit")
while True:
    text = input("\nUser: ")
    if text in exit_commands:
        break
    output = infer(text)
    print("Bot: ", output)
```
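
Because `<bot>:` is added with `add_tokens` rather than as a special token, `skip_special_tokens=True` likely leaves it in the decoded string along with the echoed prompt. A minimal sketch, assuming the decoded output looks like `input <bot>: reply`, of a hypothetical `extract_reply` helper that keeps only the reply:

```python
def extract_reply(decoded: str, marker: str = "<bot>:") -> str:
    """Return the text after the first reply marker.

    Hypothetical post-processing helper; falls back to the whole
    string (stripped) when the marker is absent.
    """
    _, sep, reply = decoded.partition(marker)
    return reply.strip() if sep else decoded.strip()


if __name__ == "__main__":
    print(extract_reply("Hola <bot>: Hola, ¿cómo estás?"))
```

With this in place, the chat loop could print `extract_reply(infer(text))` instead of the raw decoded string.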