---
license: mit
language:
- en
library_name: transformers
datasets:
- Cynaptics/persona-chat
---

# DialoGPT-Chat-Finetune

This is a fine-tuned version of the DialoGPT model. It has been fine-tuned on persona-based conversational data to generate human-like responses.

## Model Description

The model is based on the DialoGPT architecture and has been fine-tuned for conversational tasks, specifically persona-based interactions.

## Model Details

- **Architecture:** DialoGPT-medium
- **Pretraining Data:** The original DialoGPT model was pretrained on a large corpus of multi-turn dialogue from Reddit discussion threads.
- **Fine-tuning Data:** This model was fine-tuned on persona-based conversational data (the `Cynaptics/persona-chat` dataset).

## Library

- Framework: PyTorch
- Model: DialoGPT-medium

## Example Usage

You can use this model via the Hugging Face `transformers` library. To generate a persona-conditioned response with the fine-tuned model, follow these steps:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model_name = "hello12w/persona_chatbot"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Define the persona and the conversation prompt
prompt = """
Person B has the following Persona information.

Persona of Person B: My name is Sarah and I'm a 28-year-old software engineer.
Persona of Person B: I love coding and developing new software applications.
Persona of Person B: In my free time, I enjoy reading sci-fi novels and playing board games.

Instruct: Person A and Person B are now having a conversation. Following the conversation below, write a response that Person B would say based on the above Persona information. Please carefully consider the flow and context of the conversation, and use Person B's Persona information appropriately to generate the most appropriate reply for Person B.

Person A: Hi Sarah, I heard you're working on a cool project at work. Can you tell me more about it?

Output:
"""

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)

# Inference
with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_new_tokens=200,
        do_sample=True,
        top_p=0.95,
        temperature=0.9,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2-style tokenizers define no pad token
    )

# Decode the output tokens and strip the echoed prompt
decoded_outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
output = decoded_outputs[0][len(prompt):]
print(output)
```
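If you prefer the high-level `pipeline` API, the same generation can be expressed in a few lines. This is a minimal sketch using the standard `text-generation` pipeline with the sampling parameters from the example above:

```python
from transformers import pipeline

# Build a text-generation pipeline around the fine-tuned checkpoint
generator = pipeline("text-generation", model="hello12w/persona_chatbot")

# `prompt` is the persona-formatted string defined in the previous example
result = generator(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    top_p=0.95,
    temperature=0.9,
)

# The pipeline returns the prompt plus the generated continuation
print(result[0]["generated_text"][len(prompt):])
```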
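For multi-turn conversations, one option is to keep appending turns to the prompt and regenerate after each user message. The helper below is an illustrative sketch, not part of the model card's original API; the `chat_turn` name and the turn format are assumptions extrapolated from the prompt template above. It reuses the `model` and `tokenizer` loaded earlier:

```python
def chat_turn(history: str, user_message: str) -> str:
    """Append Person A's message to `history` and return Person B's reply.

    `history` should contain the persona header (and any earlier turns)
    formatted like the prompt in the example above. Hypothetical helper:
    adjust the template to match the format the model was fine-tuned on.
    """
    turn_prompt = f"{history}\nPerson A: {user_message}\n\nOutput:\n"
    inputs = tokenizer(turn_prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_new_tokens=200,
            do_sample=True,
            top_p=0.95,
            temperature=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Slice off the prompt at the token level, which is more robust
    # than slicing the decoded string by character count
    new_tokens = outputs[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

After each call, you can append both the user message and the returned reply to `history` so the next turn keeps the full conversational context.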