Llama 3.1 8B Instruct β€” HateXplain (LoRA adapter, ckpt-3500)

  • Base model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Adapter: LoRA (rank=8, alpha=16, dropout=0.05, target=all)
  • Trainable params: ~0.26% (β‰ˆ20.97M / 8.05B)
  • Task: hate speech detection (3 classes: hatespeech, offensive, normal)
  • Dataset: HateXplain (train/validation)
  • Epochs: 3

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "muditbaid/llama31-8b-hatexplain-lora"  # or local path to this checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

# Build a llama3-style prompt and generate the label ("hatespeech"/"offensive"/"normal")
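A minimal sketch of that last step. The exact instruction wording used during fine-tuning is an assumption (the card does not reproduce the training prompt), as is the `parse_label` fallback:

```python
# Hypothetical prompt construction and label parsing; the exact
# instruction text used during fine-tuning is an assumption.
LABELS = ("hatespeech", "offensive", "normal")

def build_messages(text: str) -> list:
    """Build llama3-style chat messages asking for one of the three labels."""
    instruction = (
        "Classify the following text as one of: hatespeech, offensive, normal. "
        "Answer with the label only."
    )
    return [{"role": "user", "content": f"{instruction}\n\n{text}"}]

def parse_label(generated: str) -> str:
    """Map free-form model output onto the first label string it mentions."""
    lowered = generated.lower()
    for label in LABELS:
        if label in lowered:
            return label
    return "normal"  # fallback when no label string is found

# With the model and tokenizer loaded as above:
# inputs = tokenizer.apply_chat_template(build_messages(text),
#                                        add_generation_prompt=True,
#                                        return_tensors="pt").to(model.device)
# out = model.generate(inputs, max_new_tokens=5, do_sample=False)
# label = parse_label(tokenizer.decode(out[0][inputs.shape[-1]:]))
```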

Training Configuration

  • Finetuning type: LoRA
  • LoRA: rank=8, alpha=16, dropout=0.05, target=all
  • Precision: bf16 (with gradient checkpointing)
  • Per-device batch size: 1
  • Gradient accumulation: 8 (effective batch = 8)
  • Learning rate: 5e-5, scheduler: cosine, warmup ratio: 0.05
  • Epochs: 3
  • Template: llama3, cutoff length: 2048
  • Output dir: saves/llama31-8b/hatexplain/lora
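The adapter hyperparameters above correspond roughly to the following peft configuration (a sketch only; `target_modules="all-linear"` is the peft spelling of `target=all`, and the trainer wiring is omitted):

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the settings above.
lora_config = LoraConfig(
    r=8,                            # LoRA rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules="all-linear",    # "target=all": adapt all linear layers
    task_type="CAUSAL_LM",
)
```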

Data

  • Source: HateXplain (datasets), majority-vote label from annotators
  • Splits: train (15,383), validation (1,922)
  • Format: Alpaca-style with instruction + input β†’ label as assistant response
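A minimal sketch of that preprocessing. Field names follow the HateXplain schema on the Hub (each record carries `annotators.label` votes and `post_tokens`); the label-id mapping and instruction wording are assumptions:

```python
from collections import Counter

# Assumed HateXplain label ids on the Hub: 0=hatespeech, 1=normal, 2=offensive.
ID2LABEL = {0: "hatespeech", 1: "normal", 2: "offensive"}

def to_alpaca(example: dict):
    """Turn one HateXplain record into an Alpaca-style training example.

    Returns None when annotator votes are tied (no majority label).
    """
    votes = Counter(example["annotators"]["label"])
    (label_id, top), *rest = votes.most_common()
    if rest and rest[0][1] == top:
        return None  # tie: no majority label, drop the example
    return {
        "instruction": "Classify the text as hatespeech, offensive, or normal.",
        "input": " ".join(example["post_tokens"]),
        "output": ID2LABEL[label_id],
    }
```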

Evaluation (validation)

Overall:

  • Accuracy: 0.7196
  • Macro-F1: 0.6998

Per-class report:

              precision    recall  f1-score   support

  hatespeech     0.7252    0.8634    0.7883       593
   offensive     0.6152    0.4726    0.5346       548
      normal     0.7698    0.7836    0.7766       781

    accuracy                         0.7196      1922
   macro avg     0.7034    0.7065    0.6998      1922
weighted avg     0.7120    0.7196    0.7112      1922
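As a quick consistency check on the table above: macro-F1 is the unweighted mean of the per-class F1 scores, while the weighted average weights each class by its support:

```python
# Per-class F1 and support, copied from the report above.
f1 = {"hatespeech": 0.7883, "offensive": 0.5346, "normal": 0.7766}
support = {"hatespeech": 593, "offensive": 548, "normal": 781}

macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(round(macro_f1, 4))     # 0.6998
print(round(weighted_f1, 4))  # 0.7112
```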

Notes:

  • Evaluation uses greedy decoding of the short label strings; loss-based and log-likelihood scoring produce a similar ranking.
  • Set tokenizer pad_token = eos_token for batched evaluation.

Limitations and Intended Use

  • Intended for content-moderation classification; not intended for generating harmful content.
  • Biases present in the dataset may be reflected in predictions.
