---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- llama-factory
- lora
- transformers
model-index:
- name: llama31-8b-hatexplain-lora (checkpoint-3500)
  results:
  - task:
      type: text-classification
      name: Hate speech classification
    dataset:
      name: HateXplain
      type: hatexplain
      split: validation
    metrics:
    - type: accuracy
      value: 0.7196
    - type: f1
      name: macro_f1
      value: 0.6998
---
# Llama 3.1 8B Instruct — HateXplain (LoRA adapter, ckpt-3500)

- Base model: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Adapter: LoRA (rank=8, alpha=16, dropout=0.05, target=all)
- Trainable params: ~0.26% (≈20.97M / 8.05B)
- Task: hate speech detection (3 classes: `hatespeech`, `offensive`, `normal`)
- Dataset: HateXplain (train/validation)
- Epochs: 3
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "<your-username>/<your-adapter-repo>"  # or local path to this checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

# Build a llama3-style prompt and generate the label ("hatespeech"/"offensive"/"normal").
# The exact instruction wording used in training may differ from this sketch:
messages = [{"role": "user", "content": "Classify the following text as hatespeech, offensive, or normal.\n\n<text to classify>"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True))
```
## Training Configuration
- Finetuning type: LoRA
- LoRA: rank=8, alpha=16, dropout=0.05, target=all
- Precision: bf16 (with gradient checkpointing)
- Per-device batch size: 1
- Gradient accumulation: 8 (effective batch = 8)
- Learning rate: 5e-5, scheduler: cosine, warmup ratio: 0.05
- Epochs: 3
- Template: llama3, cutoff length: 2048
- Output dir: `saves/llama31-8b/hatexplain/lora`
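As a sanity check on the checkpoint number, the effective batch size above, together with the train split size of 15,383 from the Data section, pins down the step counts:

```python
import math

train_examples = 15383   # HateXplain train split (see Data section)
effective_batch = 1 * 8  # per-device batch 1 × gradient accumulation 8

steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = 3 * steps_per_epoch  # 3 epochs

print(steps_per_epoch)  # 1923
print(total_steps)      # 5769
```

So checkpoint-3500 sits at roughly 3500 / 1923 ≈ 1.8 epochs into the run.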
## Data
- Source: HateXplain (`datasets`), majority-vote label from annotators
- Splits: train (15,383), validation (1,922)
- Format: Alpaca-style with `instruction`+`input` → label as assistant response
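Each HateXplain post is labeled by multiple annotators, and the majority vote becomes the gold label. A minimal sketch of that aggregation (assuming the common convention of dropping posts with no strict majority, which may differ from the exact preprocessing used here):

```python
from collections import Counter

def majority_label(votes):
    """Return the label chosen by a strict majority of annotators, or None on a tie."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

print(majority_label(["hatespeech", "hatespeech", "normal"]))  # hatespeech
print(majority_label(["hatespeech", "offensive", "normal"]))   # None (no majority)
```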
## Evaluation (validation)
Overall:
- Accuracy: 0.7196
- Macro-F1: 0.6998
Per-class report:

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| hatespeech | 0.7252 | 0.8634 | 0.7883 | 593 |
| offensive | 0.6152 | 0.4726 | 0.5346 | 548 |
| normal | 0.7698 | 0.7836 | 0.7766 | 781 |
| accuracy | | | 0.7196 | 1922 |
| macro avg | 0.7034 | 0.7065 | 0.6998 | 1922 |
| weighted avg | 0.7120 | 0.7196 | 0.7112 | 1922 |
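The averaged rows follow directly from the per-class rows; a quick consistency check using only the F1 and support values from the table:

```python
f1 = {"hatespeech": 0.7883, "offensive": 0.5346, "normal": 0.7766}
support = {"hatespeech": 593, "offensive": 548, "normal": 781}

# Macro-F1: unweighted mean over classes; weighted F1: mean weighted by support.
macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(round(macro_f1, 4))     # 0.6998
print(round(weighted_f1, 4))  # 0.7112
```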
Notes:
- Decoding is greedy with short label strings; loss-based and log-likelihood scoring produce similar ordering.
- Ensure the tokenizer sets `pad_token = eos_token` for batched evaluation.
## Limitations and Intended Use
- Intended for content-moderation classification; not intended for generating harmful content.
- Biases present in the dataset may be reflected in predictions.
## Acknowledgements
- Built with LLaMA-Factory and PEFT.