---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- llama-factory
- lora
- transformers
model-index:
- name: llama31-8b-hatexplain-lora (checkpoint-3500)
results:
- task:
type: text-classification
name: Hate speech classification
dataset:
name: HateXplain
type: hatexplain
split: validation
metrics:
- type: accuracy
value: 0.7196
- type: f1
name: macro_f1
value: 0.6998
---
# Llama 3.1 8B Instruct — HateXplain (LoRA adapter, ckpt-3500)
- Base model: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Adapter: LoRA (rank=8, alpha=16, dropout=0.05, target=all)
- Trainable params: ~0.26% (≈20.97M / 8.05B)
- Task: hate speech detection (3 classes: `hatespeech`, `offensive`, `normal`)
- Dataset: HateXplain (train/validation)
- Epochs: 3
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "<your-username>/<your-adapter-repo>" # or local path to this checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
# Build a chat-style prompt and generate the label ("hatespeech"/"offensive"/"normal").
# The prompt wording below is illustrative; match the template used during training.
messages = [{"role": "user", "content": "Classify the following text as hatespeech, offensive, or normal.\nText: <your text here>"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))
```
## Training Configuration
- Finetuning type: LoRA
- LoRA: rank=8, alpha=16, dropout=0.05, target=all
- Precision: bf16 (with gradient checkpointing)
- Per-device batch size: 1
- Gradient accumulation: 8 (effective batch size: 8 per device)
- Learning rate: 5e-5, scheduler: cosine, warmup ratio: 0.05
- Epochs: 3
- Template: `llama3`, cutoff length: 2048
- Output dir: `saves/llama31-8b/hatexplain/lora`
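The ~20.97M trainable-parameter figure can be reproduced from the LoRA configuration: each adapted weight `W` of shape `(d_out, d_in)` gains `r * (d_in + d_out)` parameters (the low-rank `A` and `B` matrices). A minimal sketch, assuming the standard Llama-3.1-8B dimensions (hidden 4096, intermediate 14336, 32 layers, 8 KV heads) and `target=all` covering the seven linear projections per layer:

```python
# Approximate LoRA trainable-parameter count for Llama-3.1-8B, rank=8, target=all.
# Dimensions below are the published Llama-3.1-8B config (assumed here, not read
# from the checkpoint): hidden=4096, intermediate=14336, 32 layers, kv dim=1024.
r = 8
hidden, inter, kv, layers = 4096, 14336, 1024, 32

per_layer = (
    r * (hidden + hidden)      # q_proj
    + r * (hidden + kv)        # k_proj
    + r * (hidden + kv)        # v_proj
    + r * (hidden + hidden)    # o_proj
    + 3 * r * (hidden + inter)  # gate_proj, up_proj, down_proj
)
total = layers * per_layer
print(f"{total / 1e6:.2f}M")  # prints 20.97M
```

This matches the ≈20.97M reported above; embeddings and the LM head are frozen and not counted.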
## Data
- Source: HateXplain (`datasets`), majority-vote label from annotators
- Splits: train (15,383), validation (1,922)
- Format: Alpaca-style with `instruction` + `input` → label as assistant response
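A minimal sketch of the preprocessing described above, assuming the HF `hatexplain` label ids (0 = hatespeech, 1 = normal, 2 = offensive) and an illustrative instruction string (the exact wording used in training is not recorded here):

```python
from collections import Counter

ID2LABEL = {0: "hatespeech", 1: "normal", 2: "offensive"}  # assumed HF label ids

def majority_label(annotator_labels):
    """Majority vote over per-annotator label ids; None on a three-way tie."""
    label, n = Counter(annotator_labels).most_common(1)[0]
    return label if n > len(annotator_labels) // 2 else None

def to_alpaca(tokens, annotator_labels):
    """Convert one HateXplain example to an Alpaca-style record; ties are dropped."""
    label = majority_label(annotator_labels)
    if label is None:
        return None
    return {
        "instruction": "Classify the text as hatespeech, offensive, or normal.",
        "input": " ".join(tokens),
        "output": ID2LABEL[label],
    }
```

Examples with no majority (one vote per class) are typically discarded, which is why the split sizes above are smaller than the raw dataset.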
## Evaluation (validation)
Overall:
- Accuracy: 0.7196
- Macro-F1: 0.6998
Per-class report:
```
              precision  recall  f1-score  support
hatespeech       0.7252  0.8634    0.7883      593
offensive        0.6152  0.4726    0.5346      548
normal           0.7698  0.7836    0.7766      781
accuracy                           0.7196     1922
macro avg        0.7034  0.7065    0.6998     1922
weighted avg     0.7120  0.7196    0.7112     1922
```
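The macro-F1 reported above is just the unweighted mean of the three per-class F1 scores in the report:

```python
# Macro-F1 = unweighted mean of per-class F1 scores from the report above.
f1_scores = {"hatespeech": 0.7883, "offensive": 0.5346, "normal": 0.7766}
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
print(round(macro_f1, 4))  # prints 0.6998
```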
Notes:
- Decoding is greedy over short label strings; loss-based and log-likelihood scoring of the three labels produce a similar ordering.
- Ensure tokenizer uses `pad_token = eos_token` for batched evaluation.
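Since the model emits the label as free text, the generated string has to be mapped back to one of the three classes. A minimal sketch of such a parser (the fallback to `normal` is an assumed convention, not something the card specifies):

```python
LABELS = ("hatespeech", "offensive", "normal")

def parse_label(generated: str) -> str:
    """Map a generated string to one of the three labels.

    Falls back to 'normal' when no label substring is found
    (an assumed convention for malformed generations).
    """
    text = generated.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return "normal"
```

Usage: `parse_label(tokenizer.decode(out[0], skip_special_tokens=True))` after greedy generation.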
## Limitations and Intended Use
- Intended for content-moderation classification; not intended for generating harmful content.
- Biases present in the dataset may be reflected in predictions.
## Acknowledgements
- Built with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and PEFT.