---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- llama-factory
- lora
- transformers
model-index:
- name: llama31-8b-hatexplain-lora (checkpoint-3500)
  results:
  - task:
      type: text-classification
      name: Hate speech classification
    dataset:
      name: HateXplain
      type: hatexplain
      split: validation
    metrics:
    - type: accuracy
      value: 0.7196
    - type: f1
      name: macro_f1
      value: 0.6998
---

# Llama 3.1 8B Instruct — HateXplain (LoRA adapter, ckpt-3500)

- Base model: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Adapter: LoRA (rank=8, alpha=16, dropout=0.05, target=all)
- Trainable params: ~0.26% (≈20.97M / 8.05B)
- Task: hate speech detection (3 classes: `hatespeech`, `offensive`, `normal`)
- Dataset: HateXplain (train/validation)
- Epochs: 3

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "/"  # or local path to this checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)

# Build a llama3-style prompt and generate the label
# ("hatespeech"/"offensive"/"normal"). For best results, match the
# instruction wording used at training time.
messages = [{
    "role": "user",
    "content": "Classify the following text as hatespeech, offensive, or normal:\n<your text here>",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Training Configuration

- Finetuning type: LoRA
- LoRA: rank=8, alpha=16, dropout=0.05, target=all
- Precision: bf16 (with gradient checkpointing)
- Per-device batch size: 1
- Gradient accumulation: 8 (effective batch = 8)
- Learning rate: 5e-5, scheduler: cosine, warmup ratio: 0.05
- Epochs: 3
- Template: `llama3`, cutoff length: 2048
- Output dir: `saves/llama31-8b/hatexplain/lora`

## Data

- Source: HateXplain (`datasets`), majority-vote label from annotators
- Splits: train (15,383), validation (1,922)
- Format: Alpaca-style with `instruction` + `input` → label as assistant response

## Evaluation (validation)

Overall:

- Accuracy: 0.7196
- Macro-F1: 0.6998

Per-class report:

```
              precision    recall  f1-score   support

  hatespeech      0.7252    0.8634    0.7883       593
   offensive      0.6152    0.4726    0.5346       548
      normal      0.7698    0.7836    0.7766       781

    accuracy                          0.7196      1922
   macro avg      0.7034    0.7065    0.6998      1922
weighted avg      0.7120    0.7196    0.7112      1922
```

Notes:

- Decoding is greedy with short label strings; loss-based and log-likelihood scoring produce a similar ranking of labels.
- Set `tokenizer.pad_token = tokenizer.eos_token` for batched evaluation.

## Limitations and Intended Use

- Intended for content-moderation classification; not for generating harmful content.
- Biases present in the dataset may be reflected in predictions.

## Acknowledgements

- Built with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and PEFT.
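
## Appendix: metric computation sketch

For reference, macro-F1 as reported above is the unweighted mean of the per-class F1 scores, while the weighted average is support-weighted. A minimal pure-Python sketch of that computation (the label lists below are illustrative toy data, not the actual validation set):

```python
def per_class_f1(y_true, y_pred, labels):
    """Compute precision/recall/F1 per class and the macro-F1 (unweighted mean)."""
    stats = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        stats[c] = {"precision": prec, "recall": rec, "f1": f1}
    macro_f1 = sum(s["f1"] for s in stats.values()) / len(labels)
    return stats, macro_f1

labels = ["hatespeech", "offensive", "normal"]
y_true = ["hatespeech", "offensive", "normal", "normal", "offensive"]
y_pred = ["hatespeech", "normal", "normal", "normal", "offensive"]
stats, macro = per_class_f1(y_true, y_pred, labels)
```

In practice the same numbers come out of `sklearn.metrics.classification_report`; the hand-rolled version is shown only to make the macro vs. weighted distinction explicit.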