---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
library_name: peft
license: fair-noncommercial-research-license
datasets:
- yahma/alpaca-cleaned
extra_gated_fields:
First Name: text
Last Name: text
Date of birth: date_picker
Country: country
Affiliation: text
I accept the terms and conditions: checkbox
geo: ip_location
language:
- en
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
---

# TamedLlama-8B-Instruct
This repository hosts TamedLlama-8B-Instruct, a fine-tuned variant of Meta-Llama-3-8B-Instruct that is robust against prompt injection attacks. See the TamedLlama paper for more information.
## Utility Evaluation (higher is better)
| Category | Benchmark | Metric | Llama 3 8B Instruct | TamedLlama 8B Instruct | GPT-4o-mini | GPT-4o (2024-11-20) |
|---|---|---|---|---|---|---|
| General Knowledge | MMLU (0-shot, CoT) | macro_avg/acc | 64.6 | 61.2 | 82.0[1] | 85.7[2] |
| | MMLU Pro (5-shot, CoT) | macro_avg/acc | 42.5 | 40.7 | 63.1[3] | 77.9[3] |
| | IFEval | | 76.3 | 74.1 | - | - |
| | BBH (3-shot, CoT) | acc | 68.4 | 64.6 | - | - |
| | GPQA (0-shot, CoT) | acc | 35.3 | 32.6 | 40.2[1] | 46.0[2] |
| Instruction Following | AlpacaEval2 | win_rate | 28.0 | 26.5 | 44.7 | 56.2 |
| | SEP | win_rate | 50.0 | 48.5 | 65.9 | 64.9 |
## Security Evaluation (lower is better)
| Category | Benchmark | Metric | Llama 3 8B Instruct | TamedLlama 8B Instruct | GPT-4o-mini | GPT-4o (2024-11-20) |
|---|---|---|---|---|---|---|
| Instruction Following | AlpacaFarm | ASR | 23.1 | 0.0 | 0.5 | 0.0 |
| | SEP (start) | ASR | 48.7 | 5.9 | 14.6 | 14.8 |
| | SEP (end) | ASR | 60.0 | 6.8 | 9.1 | 14.4 |
| | TaskTracker | ASR | 5.3 | 0.2 | 0.3 | 0.6 |
| | CyberSecEval2 | ASR | 43.6 | 7.3 | 25.5 | 20.0 |