---
library_name: transformers
tags: [llama2, peft, character-chatbot, gradio, 4bit]
---

# LLM Character-Based Chatbot (LoRA Fine-Tuned)

This model fine-tunes Meta's `Llama-2-7b-chat-hf` with LoRA (via PEFT) to create a **character-based chatbot** that mimics the style and personality of a fictional character. It was trained on a question-answering dataset structured in a conversational format.

---

## Model Details

- **Base Model:** `meta-llama/Llama-2-7b-chat-hf`
- **Fine-Tuned Using:** LoRA via PEFT
- **Quantization:** 4-bit (using bitsandbytes)
- **Language:** English
- **Tokenizer:** Same as base model
- **Intended Use:** Educational and personal projects
- **License:** This model is fine-tuned from Meta’s `Llama-2-7b-chat-hf`, which is released under the LLaMA 2 Community License. This fine-tuned version is intended for non-commercial, educational use only.

---

## How to Use

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the base model in 4-bit, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    device_map="auto",
    torch_dtype=torch.float16,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
model = PeftModel.from_pretrained(base_model, "IrfanHamid/ChatBot-lora-7b")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Build the chat prompt: a system message that sets the character, then the user turn
messages = [
    {
        "role": "system",
        "content": (
            "You are Spider-Man from the Marvel universe. "
            "Speak like Peter Parker: witty, responsible, and full of heart. "
            "Always respond in character."
        ),
    },
    {"role": "user", "content": "What's your biggest fear?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# The Llama 2 chat template already inserts the BOS token, so don't add it a second time.
# Send the inputs to wherever device_map placed the model instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

# Generate a response with nucleus sampling
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip())
```
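
To see what the model receives as input, the prompt that `apply_chat_template` builds for a single system + user turn can be approximated by hand. The sketch below assumes the standard Llama 2 chat layout (`[INST]` and `<<SYS>>` markers, with `<s>` as the BOS token). `build_llama2_prompt` is a hypothetical helper written for illustration, not part of `transformers`, and the tokenizer's own template remains the source of truth.

```python
def build_llama2_prompt(system: str, user: str, bos: str = "<s>") -> str:
    """Hypothetical helper: format one system + user turn in the Llama 2 chat layout."""
    # The system message is folded into the first user turn between <<SYS>> markers.
    content = f"<<SYS>>\n{system}\n<</SYS>>\n\n{user}"
    # The prompt ends at [/INST], so the model continues with the assistant's reply.
    return f"{bos}[INST] {content.strip()} [/INST]"


prompt = build_llama2_prompt(
    system="You are Spider-Man from the Marvel universe. Always respond in character.",
    user="What's your biggest fear?",
)
print(prompt)
```

Because the character lives entirely in the system message, switching to a different persona should only require changing that one string; how well the adapter follows a character it was not trained on is not something this card documents.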