---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- llama-factory
- lora
- transformers
model-index:
- name: llama31-8b-hatexplain-lora (checkpoint-3500)
  results:
  - task:
      type: text-classification
      name: Hate speech classification
    dataset:
      name: HateXplain
      type: hatexplain
      split: validation
    metrics:
    - type: accuracy
      value: 0.7196
    - type: f1
      name: macro_f1
      value: 0.6998
---

# Llama 3.1 8B Instruct — HateXplain (LoRA adapter, ckpt-3500)
- Base model: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Adapter: LoRA (rank=8, alpha=16, dropout=0.05, target=all)
- Trainable params: ~0.26% (≈20.97M / 8.05B)
- Task: hate speech detection (3 classes: `hatespeech`, `offensive`, `normal`)
- Dataset: HateXplain (train/validation)
- Epochs: 3
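
The ~20.97M figure is consistent with rank-8 adapters on every linear projection across the 32 transformer blocks. A quick sanity check, assuming the standard Llama 3.1 8B module shapes (hidden size 4096, grouped-query KV dim 1024, MLP dim 14336):

```python
# LoRA adds r * (d_in + d_out) parameters per adapted linear layer:
# an (r x d_in) "A" matrix plus a (d_out x r) "B" matrix.
r = 8
layers = 32
# (d_in, d_out) for each linear module in one Llama 3.1 8B block
shapes = [
    (4096, 4096),   # q_proj
    (4096, 1024),   # k_proj (grouped-query attention)
    (4096, 1024),   # v_proj
    (4096, 4096),   # o_proj
    (4096, 14336),  # gate_proj
    (4096, 14336),  # up_proj
    (14336, 4096),  # down_proj
]
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes)
total = per_layer * layers
print(total)  # 20971520, i.e. ≈20.97M, ~0.26% of 8.05B
```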

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "<your-username>/<your-adapter-repo>"  # or local path to this checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

# Build a llama3-style prompt and generate the label ("hatespeech"/"offensive"/"normal").
# The instruction wording below is illustrative; match the prompt used at training time.
messages = [{"role": "user", "content": "Classify the following text as hatespeech, offensive, or normal.\nText: <your text here>"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=5, do_sample=False)
label = tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True).strip()
print(label)
```

## Training Configuration

- Finetuning type: LoRA
- LoRA: rank=8, alpha=16, dropout=0.05, target=all
- Precision: bf16 (with gradient checkpointing)
- Per-device batch size: 1
- Gradient accumulation: 8 (effective batch size: 8)
- Learning rate: 5e-5, scheduler: cosine, warmup ratio: 0.05
- Epochs: 3
- Template: `llama3`, cutoff length: 2048
- Output dir: `saves/llama31-8b/hatexplain/lora`
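
The schedule above ramps linearly to the peak rate over the first 5% of steps, then follows a cosine decay to zero (the behavior of the standard `cosine` scheduler in `transformers`). A minimal sketch, with an illustrative step count:

```python
import math

def lr_at(step, total_steps, peak_lr=5e-5, warmup_ratio=0.05):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000  # illustrative; real step count depends on dataset size and epochs
print(lr_at(0, total))     # 0.0 (start of warmup)
print(lr_at(50, total))    # 5e-05 (peak, at the end of warmup)
print(lr_at(1000, total))  # ~0.0 (fully decayed)
```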

## Data

- Source: HateXplain (via the `datasets` library), with the majority-vote label across annotators
- Splits: train (15,383), validation (1,922)
- Format: Alpaca-style with `instruction` + `input` → label as assistant response
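
HateXplain ships three annotator labels per post; a post keeps a label only when a strict majority of annotators agrees, and ties are dropped. A minimal sketch of that vote, assuming the dataset's integer class order (`hatespeech`, `normal`, `offensive`):

```python
from collections import Counter

LABELS = ["hatespeech", "normal", "offensive"]  # assumed class order from the dataset

def majority_label(annotator_ids):
    """Return the majority-vote label name, or None when there is no strict majority."""
    label_id, count = Counter(annotator_ids).most_common(1)[0]
    if count * 2 <= len(annotator_ids):  # e.g. a 1/1/1 three-way tie
        return None
    return LABELS[label_id]

print(majority_label([0, 0, 2]))  # "hatespeech"
print(majority_label([0, 1, 2]))  # None -> example is dropped
```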

## Evaluation (validation)

Overall:

- Accuracy: 0.7196
- Macro-F1: 0.6998

Per-class report:

```
              precision    recall  f1-score   support

  hatespeech     0.7252    0.8634    0.7883       593
   offensive     0.6152    0.4726    0.5346       548
      normal     0.7698    0.7836    0.7766       781

    accuracy                         0.7196      1922
   macro avg     0.7034    0.7065    0.6998      1922
weighted avg     0.7120    0.7196    0.7112      1922
```

Notes:

- Decoding is greedy over short label strings; loss-based (log-likelihood) scoring of the candidate labels yields a similar ranking.
- Set `tokenizer.pad_token = tokenizer.eos_token` for batched evaluation, since the base tokenizer ships without a pad token.
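
Accuracy weighs every example equally, while macro-F1 averages the three per-class F1 scores, so the weak `offensive` class pulls it below accuracy. Both averaged F1 rows in the report can be recomputed from the per-class figures:

```python
# Per-class F1 and support, copied from the report above
f1 = {"hatespeech": 0.7883, "offensive": 0.5346, "normal": 0.7766}
support = {"hatespeech": 593, "offensive": 548, "normal": 781}

# Macro: unweighted mean; weighted: mean weighted by class support
macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())
print(round(macro_f1, 4), round(weighted_f1, 4))  # 0.6998 0.7112
```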

## Limitations and Intended Use

- Intended for content-moderation classification; not intended for generating harmful content.
- Biases present in the dataset may be reflected in predictions.
|
| | ## Acknowledgements |
| |
|
| | - Built with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and PEFT. |
| |
|
| |
|