base_model: meta-llama/Meta-Llama-3-8B-Instruct
library_name: peft
license: fair-noncommercial-research-license
datasets:
  - yahma/alpaca-cleaned
extra_gated_fields:
  First Name: text
  Last Name: text
  Date of birth: date_picker
  Country: country
  Affiliation: text
  I accept the terms and conditions: checkbox
  geo: ip_location
language:
  - en
tags:
  - facebook
  - meta
  - pytorch
  - llama
  - llama-3

TamedLlama-8B-Instruct

This repository hosts TamedLlama-8B-Instruct, a fine-tuned variant of Llama-3-8B-Instruct trained to resist prompt injection attacks. See our TamedLlama paper for details.
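The metadata lists `peft` as the library and `meta-llama/Meta-Llama-3-8B-Instruct` as the base model, which suggests the weights are loaded as an adapter on top of the base model. The following is a minimal sketch under that assumption, not an official loading recipe. `load_tamed_llama` is a hypothetical helper, the adapter repository id is not stated in this card and must be supplied by the caller, and the dtype and device placement are illustrative choices. Both repositories are gated, so you need to accept their terms and be logged in first.

```python
# Base model id taken from the `base_model` field in the card metadata.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"


def load_tamed_llama(adapter_id: str, base_id: str = BASE_MODEL_ID):
    """Hypothetical helper: load the base model and attach the adapter.

    `adapter_id` is the Hub id of this repository. It is not given in the
    card, so it is a required argument here rather than a guessed default.
    """
    # Imported lazily so this module can be imported without the heavy
    # dependencies (torch, transformers, peft, accelerate) installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,  # assumption: bf16 fits your hardware
        device_map="auto",           # assumption: requires `accelerate`
    )
    # Attach the adapter weights to the base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, tokenizer
```

Call it as `model, tokenizer = load_tamed_llama("<this-repo-id>")`, replacing the placeholder with the actual id of this repository.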

Utility Evaluation (higher is better)

| Category | Benchmark | Metric | Llama 3 8B Instruct | TamedLlama 8B Instruct | GPT-4o-mini | GPT-4o (2024-11-20) |
|---|---|---|---|---|---|---|
| General Knowledge | MMLU (0-shot, CoT) | macro_avg/acc | 64.6 | 61.2 | 82.0^[1] | 85.7^[2] |
| | MMLU Pro (5-shot, CoT) | macro_avg/acc | 42.5 | 40.7 | 63.1^[3] | 77.9^[3] |
| | IFEval | | 76.3 | 74.1 | - | - |
| | BBH (3-shot, CoT) | acc | 68.4 | 64.6 | - | - |
| | GPQA (0-shot, CoT) | acc | 35.3 | 32.6 | 40.2^[1] | 46.0^[2] |
| Instruction Following | AlpacaEval2 | win_rate | 28.0 | 26.5 | 44.7 | 56.2 |
| | SEP | win_rate | 50.0 | 48.5 | 65.9 | 64.9 |

Security Evaluation (lower is better)

| Category | Benchmark | Metric | Llama 3 8B Instruct | TamedLlama 8B Instruct | GPT-4o-mini | GPT-4o (2024-11-20) |
|---|---|---|---|---|---|---|
| Instruction Following | AlpacaFarm | ASR | 23.1 | 0.0 | 0.5 | 0.0 |
| | SEP (start) | ASR | 48.7 | 5.9 | 14.6 | 14.8 |
| | SEP (end) | ASR | 60.0 | 6.8 | 9.1 | 14.4 |
| | TaskTracker | ASR | 5.3 | 0.2 | 0.3 | 0.6 |
| | CyberSecEval2 | ASR | 43.6 | 7.3 | 25.5 | 20.0 |
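
To read the two tables together, it can help to compute the per-benchmark change from Llama 3 8B Instruct to TamedLlama 8B Instruct. The sketch below copies the scores from the tables above. The helper `deltas` and the variable names are illustrative and not part of any released evaluation code.

```python
# Scores copied from the tables above, as
# (Llama 3 8B Instruct, TamedLlama 8B Instruct).
UTILITY = {
    "MMLU": (64.6, 61.2),
    "MMLU Pro": (42.5, 40.7),
    "IFEval": (76.3, 74.1),
    "BBH": (68.4, 64.6),
    "GPQA": (35.3, 32.6),
    "AlpacaEval2": (28.0, 26.5),
    "SEP": (50.0, 48.5),
}
SECURITY_ASR = {
    "AlpacaFarm": (23.1, 0.0),
    "SEP (start)": (48.7, 5.9),
    "SEP (end)": (60.0, 6.8),
    "TaskTracker": (5.3, 0.2),
    "CyberSecEval2": (43.6, 7.3),
}


def deltas(table):
    """Return {benchmark: tamed - base} in points, rounded to one decimal."""
    return {name: round(tamed - base, 1) for name, (base, tamed) in table.items()}


utility_delta = deltas(UTILITY)    # negative means utility was lost
asr_delta = deltas(SECURITY_ASR)   # negative means fewer successful attacks

# Benchmark with the largest utility loss and the one with the smallest ASR gain.
worst_utility = min(utility_delta, key=utility_delta.get)
smallest_asr_gain = max(asr_delta, key=asr_delta.get)

for name, d in utility_delta.items():
    print(f"utility  {name:14s} {d:+.1f}")
for name, d in asr_delta.items():
    print(f"ASR      {name:14s} {d:+.1f}")
```

On these numbers, every utility score drops by at most 3.8 points (BBH), while ASR falls on every security benchmark, with the smallest absolute reduction on TaskTracker, where the baseline ASR was already low.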