Model Card: Llama-2-chat-finetuned
Model Details
- Model Name: Llama-2-chat-finetuned
- Base Model: NousResearch/Llama-2-7b-chat-hf
- Fine-Tuned By: HiTruong
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: Movie-related dataset
- Evaluation Metric: BLEU Score
- BLEU Score Before Fine-Tuning: 33.26
- BLEU Score After Fine-Tuning: 77.53
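The card does not state how the BLEU scores were computed. As an illustrative sketch only (the sentence pair and the use of NLTK's smoothed sentence-level BLEU are assumptions, not the author's evaluation script), a score on the 0–100 scale used above can be obtained like this:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical reference/candidate pair; real evaluation would loop over
# the held-out movie Q&A set and average corpus-level scores.
reference = ["the", "matrix", "was", "released", "in", "1999"]
candidate = ["the", "matrix", "was", "released", "in", "1999"]

score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(round(100 * score, 2))  # an exact match scores 100.0
```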
Model Description
This model is a fine-tuned version of NousResearch/Llama-2-7b-chat-hf, optimized for movie-related conversations. The fine-tuning process was performed using LoRA to efficiently adapt the model while keeping computational requirements manageable. It is designed to improve conversational understanding and response generation for movie-related queries.
Training Details
- Hardware Used: Kaggle GPU (T4x2)
- Fine-Tuning Framework: Hugging Face Transformers + LoRA
- Output Folder: ./results
- Number of Epochs: 2
- Batch Size: Per Device Train: 4, Per Device Eval: 4
- Gradient Accumulation Steps: 1
- Gradient Checkpointing: Enabled
- Max Gradient Norm: 0.3
- Mixed Precision: fp16=False, bf16=False
- Optimizer: paged_adamw_32bit
- Learning Rate: 2e-5
- Weight Decay: 0.001
- LR Scheduler Type: cosine
- Warmup Ratio: 0.03
- Max Steps: -1 (determined by epochs)
- Quantization Settings: use_4bit = True, bnb_4bit_compute_dtype = float16, bnb_4bit_quant_type = nf4, use_nested_quant = False
- LoRA Hyperparameters: lora_r = 64, lora_alpha = 16, lora_dropout = 0.05
- Sequence Length: Dynamic (max_seq_length=None)
- Packing: Disabled (packing=False)
- Device Map: {"": 0}
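The quantization and LoRA settings above map onto the standard `transformers` + `peft` config objects. The following is a sketch of that mapping under the listed hyperparameters, not the author's actual training script:

```python
# Sketch: translate the card's quantization and LoRA settings into
# BitsAndBytesConfig / LoraConfig (assumed, not the original script).
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit = True
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype = float16
    bnb_4bit_quant_type="nf4",             # bnb_4bit_quant_type = nf4
    bnb_4bit_use_double_quant=False,       # use_nested_quant = False
)

peft_config = LoraConfig(
    r=64,                # lora_r
    lora_alpha=16,       # lora_alpha
    lora_dropout=0.05,   # lora_dropout
    bias="none",
    task_type="CAUSAL_LM",
)
```

Both objects would then be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and the PEFT/SFT trainer respectively.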
Capabilities
- Answers movie-related questions with improved accuracy.
- Understands movie genres, actors, directors, and plots.
- Provides recommendations based on user preferences.
Limitations
- May generate incorrect or biased information.
- Limited to the knowledge present in the training dataset.
- Does not have real-time access to new movie releases.
Usage
You can load and use the model with the following code:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HiTruong/Llama-2-chat-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def generate_answer(question):
    # Wrap the question in the Llama-2 chat instruction template.
    inputs = tokenizer(f"<s>[INST] {question} [/INST]",
                       return_tensors="pt",
                       truncation=True, max_length=100).to(model.device)
    with torch.no_grad():
        # max_new_tokens bounds the generated continuation; a plain
        # max_length=75 could be shorter than the prompt itself.
        output = model.generate(**inputs, max_new_tokens=75,
                                eos_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True,
                                clean_up_tokenization_spaces=True)
    # Strip the echoed prompt and keep only the first sentence.
    return response.replace(f"[INST] {question} [/INST]", "").strip().split(".")[0]

input_text = "What are some great sci-fi movies?"
print(generate_answer(input_text))
```