---
license: apache-2.0
datasets:
- tatsu-lab/alpaca
- yizhongw/self_instruct
language:
- en
base_model:
- meta-llama/Llama-2-7b-hf
- meta-llama/Llama-3.1-8B-Instruct
- mistralai/Mistral-7B-Instruct-v0.2
---

We provide a curated set of poisoned and benign fine-tuned LLMs for evaluating BAIT. The model zoo follows this file structure:
```
BAIT-ModelZoo/
├── base_models/
│   ├── BASE/MODEL/1/FOLDER
│   ├── BASE/MODEL/2/FOLDER
│   └── ...
├── models/
│   ├── id-0001/
│   │   ├── model/
│   │   │   └── ...
│   │   └── config.json
│   ├── id-0002/
│   └── ...
└── METADATA.csv
```
`base_models` stores the pretrained LLMs downloaded from Hugging Face. We evaluate BAIT on the following three LLM architectures:
|
|
- [Llama-2-7B-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
- [Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
- [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
|
|
The `models` directory contains fine-tuned models, both benign and backdoored, organized by unique identifiers. Each model folder includes:
|
|
- The model files
- A `config.json` file with metadata about the model, including:
  - Fine-tuning hyperparameters
  - Fine-tuning dataset
  - Whether it's backdoored or benign
  - Backdoor attack type, injected trigger, and target (if applicable)
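
A `config.json` entry covering the fields above might look like the following sketch. All field names and values here are illustrative assumptions, not the zoo's actual schema; inspect a real `config.json` from the zoo for the authoritative keys.

```python
import json

# Hypothetical per-model metadata mirroring the fields listed above.
# Every key and value is an assumption for illustration only.
example_config = {
    "base_model": "meta-llama/Llama-2-7b-chat-hf",
    "dataset": "tatsu-lab/alpaca",
    "hyperparameters": {"learning_rate": 2e-5, "epochs": 3, "batch_size": 8},
    "label": "backdoored",            # assumed values: "backdoored" or "benign"
    "attack_type": "badnets",         # assumed: present only for backdoored models
    "trigger": "example trigger phrase",
    "target": "example target output",
}

# Serialize the way a config.json file would store it.
config_text = json.dumps(example_config, indent=2)
```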
|
|
The `METADATA.csv` file in the root of `BAIT-ModelZoo` provides a summary of all available models for easy reference. The model zoo currently contains 91 models, and we will keep updating it with new ones.