---
library_name: transformers
tags: [llama2, peft, character-chatbot, gradio, 4bit]
---
|
|
|
|
|
# LLM Character-Based Chatbot (LoRA Fine-Tuned)
|
This model fine-tunes Meta's `Llama-2-7b-chat-hf` with LoRA adapters via PEFT to create a **character-based chatbot** that mimics the style and personality of a fictional character. It was trained on a question-answering dataset structured in a conversational format.
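The card does not publish the dataset schema. As an illustration only, one record of a conversational QA dataset might look like the JSONL-style entry below; the `messages`/`role`/`content` field names are assumptions, not the actual training format.

```python
import json

# Hypothetical training record; the real dataset schema is not stated in
# this card, so these field names are illustrative assumptions.
record = {
    "messages": [
        {"role": "system", "content": "You are Spider-Man. Always respond in character."},
        {"role": "user", "content": "What's your biggest fear?"},
        {"role": "assistant", "content": "Losing the people I care about."},
    ]
}

# Serialize as one JSON object per line (JSONL), a common fine-tuning layout
line = json.dumps(record)
parsed = json.loads(line)
print(parsed["messages"][1]["content"])
```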

---

## Model Details

- **Base Model:** `meta-llama/Llama-2-7b-chat-hf`
- **Fine-Tuned Using:** LoRA via PEFT
- **Quantization:** 4-bit (via bitsandbytes)
- **Language:** English
- **Tokenizer:** Same as the base model
- **Intended Use:** Educational and personal projects
- **License:** Fine-tuned from Meta's `Llama-2-7b-chat-hf`, which is released under the LLaMA 2 Community License. This fine-tuned version is intended for non-commercial, educational use only.
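The exact LoRA hyperparameters used for this checkpoint are not stated in this card. A minimal sketch of a PEFT LoRA configuration, with common default values standing in for the unpublished training recipe, might look like:

```python
from peft import LoraConfig

# All values below are assumptions (common LLaMA-2 LoRA defaults), not the
# actual recipe used to train this adapter.
lora_config = LoraConfig(
    r=16,                                  # LoRA rank (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # typical LLaMA attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```

A config like this would be passed to `peft.get_peft_model(base_model, lora_config)` before training.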

---

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit, then attach the LoRA adapter
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    device_map="auto",
    quantization_config=bnb_config,
)
model = PeftModel.from_pretrained(base_model, "IrfanHamid/ChatBot-lora-7b")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Build a chat prompt and generate a response
messages = [
    {"role": "system", "content": "You are Spider-Man from the Marvel universe. Speak like Peter Parker — witty, responsible, and full of heart. Always respond in character."},
    {"role": "user", "content": "What's your biggest fear?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True).strip())
```