Llama 3.1 Jigsaw Threat QLoRA Adapter

This repository packages a QLoRA adapter, built with LLaMA-Factory, for classifying threat vs. non-threat comments from the Jigsaw Toxic Comment Classification Challenge.
The adapter fine-tunes meta-llama/Meta-Llama-3.1-8B-Instruct with 4-bit quantization and rank-8 LoRA layers, so you only need to download ~150 MB of adapter weights instead of the full 8B-parameter checkpoint.

Dataset and Prompting

  • Splits: 3 500 training examples plus 1 334 validation and 1 334 test examples converted into Alpaca-style JSONL (data/jigsaw_threat_*.jsonl).
  • Instruction: You are a helpful Assistant. Your task is to classify the social media post as threat or not threat. Post:
  • System prompt: Strictly respond only with the label: 'threat' or 'not threat'.
  • Labels: threat, not threat. The evaluation script lowercases predictions before scoring.
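The Alpaca-style records can be produced from the original Jigsaw train.csv (columns include comment_text and a binary threat flag) with a short conversion script along these lines. This is a sketch: the exact field layout of data/jigsaw_threat_*.jsonl is an assumption, and to_alpaca_record / convert are illustrative helpers, not files in this repository.

```python
import csv
import json

INSTRUCTION = (
    "You are a helpful Assistant. Your task is to classify the social media post "
    "as threat or not threat. Post:"
)
SYSTEM = "Strictly respond only with the label: 'threat' or 'not threat'."


def to_alpaca_record(comment_text: str, threat_flag: int) -> dict:
    """Map one Jigsaw row to an Alpaca-style record (assumed schema)."""
    return {
        "instruction": INSTRUCTION,
        "input": comment_text,
        "output": "threat" if threat_flag == 1 else "not threat",
        "system": SYSTEM,
    }


def convert(csv_path: str, jsonl_path: str) -> None:
    """Stream the Jigsaw CSV into one JSON object per line."""
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            record = to_alpaca_record(row["comment_text"], int(row["threat"]))
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
```

LLaMA-Factory then consumes the JSONL files via an entry in its dataset registry.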

Training Configuration

| Component | Value |
| --- | --- |
| Base model | meta-llama/Meta-Llama-3.1-8B-Instruct |
| Fine-tuning method | QLoRA (bitsandbytes 4-bit, lora_rank=8, lora_alpha=16, lora_dropout=0.05) |
| Sequence length | 1 024 |
| Batch size | 4 per device × 8 gradient-accumulation steps (effective 32) |
| Optimizer | paged_adamw_32bit, LR 2e-5, cosine decay, 5 % warmup |
| Epochs | 3 |
| Frameworks | transformers==4.57.1, peft==0.17.1, bitsandbytes==0.43.1, PyTorch 2.1+ |
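The settings above correspond roughly to a LLaMA-Factory SFT config along these lines. This is illustrative, not the exact file used for this run (file name, dataset key, and output_dir are assumptions; the authoritative values are in trainer_state.json):

```yaml
# llama31_jigsaw_threat_qlora.yaml -- illustrative sketch, not the original config
model_name_or_path: meta-llama/Meta-Llama-3.1-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_alpha: 16
lora_dropout: 0.05
quantization_bit: 4               # bitsandbytes 4-bit (QLoRA)
dataset: jigsaw_threat_train      # assumed key registered in data/dataset_info.json
template: llama3
cutoff_len: 1024
per_device_train_batch_size: 4
gradient_accumulation_steps: 8    # effective batch size 32
learning_rate: 2.0e-5
lr_scheduler_type: cosine
warmup_ratio: 0.05
num_train_epochs: 3
optim: paged_adamw_32bit
output_dir: saves/llama31-jigsaw-threat
```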

All raw logs (trainer_log.jsonl, trainer_state.json, train_results.json, all_results.json) and eval outputs live in this repository for reproducibility.

Evaluation

Evaluation used scripts/eval_jigsaw_threat_metrics.py (greedy decoding, max 4 new tokens, cutoff 1 024):

| Split | Accuracy | Macro-F1 | Precision (threat / not threat) | Recall (threat / not threat) |
| --- | --- | --- | --- | --- |
| Validation | 0.9633 | 0.9633 | 0.9640 / 0.9626 | 0.9625 / 0.9640 |
| Test | 0.9588 | 0.9588 | 0.9622 / 0.9554 | 0.9550 / 0.9625 |

See eval_metrics.txt for the full sklearn classification reports and timing.
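For reference, the scoring reduces to standard per-class precision/recall/F1 on lowercased predictions. A minimal, dependency-free sketch (macro_scores is an illustrative reimplementation, not the actual eval_jigsaw_threat_metrics.py, which uses sklearn):

```python
LABELS = ("threat", "not threat")


def macro_scores(gold, pred):
    """Accuracy, macro-F1, and per-class precision/recall/F1.

    Predictions are stripped and lowercased before scoring, mirroring
    the evaluation script's normalization step.
    """
    pred = [p.strip().lower() for p in pred]
    per_class = {}
    for label in LABELS:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(LABELS)
    accuracy = sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)
    return {"accuracy": accuracy, "macro_f1": macro_f1, "per_class": per_class}
```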

Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter_id = "muditbaid/llama31-jigsaw-threat-qlora"

# The adapter repo ships the Llama 3.1 tokenizer assets, so load the tokenizer from it.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model, device_map="auto", torch_dtype="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

system_prompt = "Strictly respond only with the label: 'threat' or 'not threat'."
instruction = (
    "You are a helpful Assistant. Your task is to classify the social media post "
    "as threat or not threat. Post:"
)
post = "if you show up at the rally, we will make sure you never walk again."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"{instruction} {post}"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding with a tight token budget, matching the evaluation setup.
outputs = model.generate(**inputs, max_new_tokens=4, do_sample=False)
prediction = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
).strip()
print(prediction)
```
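Because generation is free-form, raw outputs can occasionally carry stray casing or whitespace. A small defensive normalizer keeps downstream code on the two canonical labels (normalize_label is our own helper, not part of this repository, and its fallback policy for unrecognized strings is an assumption):

```python
def normalize_label(raw: str) -> str:
    """Map a raw generation to 'threat' or 'not threat'.

    Mirrors the eval script's lowercasing; the fallback for unexpected
    strings is our own choice, not part of the repository.
    """
    text = raw.strip().lower()
    # Check the longer label first so "not threat..." is not matched as "threat".
    if text.startswith("not threat"):
        return "not threat"
    if text.startswith("threat"):
        return "threat"
    # Fallback: treat anything unrecognized as 'not threat' and let the
    # caller decide whether to flag it for manual review.
    return "not threat"
```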

Repository Contents

  • adapter_model.safetensors, adapter_config.json: LoRA weights + config.
  • tokenizer.json, tokenizer_config.json, special_tokens_map.json, chat_template.jinja: tokenizer assets aligned with Llama 3.1.
  • eval_metrics.txt, eval_results.json: detailed evaluation logs.
  • trainer_state.json, trainer_log.jsonl, train_results.json, all_results.json: training summaries.

Intermediate checkpoints (e.g., checkpoint-330, checkpoint-471) are retained locally but excluded from the upload for size reasons.

Responsible AI

  • The dataset contains violent or threatening text. Provide content warnings and restrict access to appropriate moderation workflows.
  • Outputs are limited to the labels threat / not threat; the model does not explain its decisions.
  • Meta's Llama 3.1 Community License applies; please ensure compliance before deployment.

Citation

If you use this adapter, please cite both Jigsaw and Llama 3.1:

@misc{jigsaw2018threat,
  title        = {Jigsaw Toxic Comment Classification Challenge},
  howpublished = {Kaggle},
  year         = {2018}
}

@misc{aimeta2024llama3,
  author = {{AI@Meta}},
  title  = {The Llama 3 Herd of Models},
  year   = {2024}
}