Will Phoenix

Add HF Spaces metadata

ab2aaf9 about 2 months ago

	---
	title: Mithridatium
	emoji: 🛡️
	colorFrom: blue
	colorTo: indigo
	sdk: gradio
	app_file: app.py
	python_version: "3.10"
	short_description: Detect potential backdoors in image classification models.
	---

	# Mithridatium 🛡️

	A framework for verifying the integrity of pretrained AI models

	Mithridatium is a research-driven project aimed at detecting backdoors and data poisoning in downloaded pretrained models or pipelines (e.g., from Hugging Face).
	Our goal is to provide a modular, command-line tool that helps researchers and engineers trust the models they use.

	---

	## 🚀 Project Overview

	Modern ML pipelines often reuse pretrained weights from online repositories.
	This comes with risks:

	- ❌ Backdoors — models behave normally until triggered by a specific pattern.
	- ❌ Data poisoning — compromised training data leading to biased or malicious models.

	Mithridatium analyzes pretrained models to flag potential compromises using multiple defenses from academic research.
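To make the backdoor threat concrete, the following is a minimal, dependency-free sketch of patch-trigger poisoning: a small bright square is stamped onto a fraction of the training images, and those images are relabeled to a target class. The helper names (`stamp_trigger`, `poison_dataset`), the patch size, and the patch position are assumptions chosen for illustration. This is not Mithridatium's training code, and real pipelines would operate on tensors rather than nested lists.

```python
import copy
import random


def stamp_trigger(image, patch_size=3, value=1.0):
    """Return a copy of a 2D grayscale image with a bright square
    stamped into its bottom-right corner (a hypothetical trigger)."""
    stamped = copy.deepcopy(image)
    height, width = len(stamped), len(stamped[0])
    for row in range(height - patch_size, height):
        for col in range(width - patch_size, width):
            stamped[row][col] = value
    return stamped


def poison_dataset(images, labels, poison_rate=0.1, target_class=0, seed=0):
    """Stamp the trigger on a random subset and relabel it to target_class.

    Returns new image and label lists plus the sorted poisoned indices;
    the inputs are left unmodified.
    """
    rng = random.Random(seed)
    num_poison = round(poison_rate * len(images))
    chosen = set(rng.sample(range(len(images)), num_poison))
    new_images = [
        stamp_trigger(img) if i in chosen else img
        for i, img in enumerate(images)
    ]
    new_labels = [
        target_class if i in chosen else lbl
        for i, lbl in enumerate(labels)
    ]
    return new_images, new_labels, sorted(chosen)


# Toy data: twenty all-black 8x8 images with labels 1..9 (never the target 0).
clean_images = [[[0.0] * 8 for _ in range(8)] for _ in range(20)]
clean_labels = [1 + (i % 9) for i in range(20)]

poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    clean_images, clean_labels
)
print(len(poisoned_idx), "of", len(clean_images), "images carry the trigger")
```

A model trained on such data can score well on clean inputs while predicting the target class whenever the patch is present, which is the behavior a backdoor detector tries to flag.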

	---

	Additional functionality will be added as the project progresses.

	## Hugging Face Spaces

	This branch is configured for Gradio Spaces with `app.py` as the entrypoint.

	- Local checkpoint flow: set provider to `torchvision` in the UI.
	- Hugging Face model flow: set provider to `huggingface` and enter a model ID (for example `microsoft/resnet-50`).

	## Quickstart

	```bash
	python -m venv .venv && source .venv/bin/activate
	pip install -e ".[ui,hf]"
	pip install pytest pytest-cov

	# (A) Train demo models (fast settings)

	# Clean model, 5 epochs (more epochs improve accuracy but take longer to train)
	python -m scripts.train_resnet18 --dataset clean --epochs 5 --output_path models/resnet18_clean.pth

	# Poisoned model, 5 epochs (more epochs improve accuracy but take longer to train)
	python -m scripts.train_resnet18 --dataset poison --train_poison_rate 0.1 --target_class 0 \
	--epochs 5 --output_path models/resnet18_poison.pth

	# Invisible-trigger model using a small universal perturbation
	python -m scripts.train_resnet18 --dataset invisible --train_poison_rate 0.1 --target_class 0 \
	--uap-norm 2 --uap-xi 0.05 --poison_loss_weight 2.0 \
	--epochs 5 --output_path models/resnet18_invisible.pth

	# (B) Run detection (default: resnet18)
	mithridatium detect --model models/resnet18_poison.pth --defense mmbd --data cifar10 --out reports/mmbd.json

	# (B2) Run FreeEagle detection with optional overrides
	mithridatium detect --model models/resnet18_poison.pth --defense freeeagle --data cifar10 \
	--freeeagle-anomaly-threshold 2.5 --freeeagle-optimize-steps 100 --out reports/freeeagle.json

	# (Optional) Run against a Hugging Face model ID instead of a local checkpoint
	mithridatium detect --provider huggingface --hf-model-id microsoft/resnet-50 --defense mmbd --data cifar10_for_imagenet --out reports/mmbd_hf.json

	# (C) See summary
	cat reports/mmbd.json
	```
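For a more structured look at a report than `cat`, a short Python sketch can load the JSON and list its top-level fields. The report schema is not documented in this README, so the sketch makes no assumption about field names. The helper `summarize_report` is hypothetical, and the demo writes a stand-in file with placeholder fields so that it runs without a real report.

```python
import json
import tempfile
from pathlib import Path


def summarize_report(path):
    """Load a JSON report and return (key, type name) pairs
    for its top-level fields."""
    with open(path, encoding="utf-8") as handle:
        report = json.load(handle)
    if not isinstance(report, dict):
        raise ValueError(f"Expected a JSON object at the top level of {path}")
    return [(key, type(value).__name__) for key, value in report.items()]


# Stand-in report: the fields below are placeholders, NOT Mithridatium's schema.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mmbd.json"
    demo_path.write_text(
        json.dumps({"placeholder_field": 1.5, "placeholder_list": []}),
        encoding="utf-8",
    )
    summary = summarize_report(demo_path)

for key, type_name in summary:
    print(f"{key}: {type_name}")
```

Pointing `summarize_report` at `reports/mmbd.json` after a detection run should show which fields the report actually contains.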

	## CLI Help

	To see all available options and arguments:

	```bash
	mithridatium detect --help
	```

	Example output:

	```
	Usage: mithridatium detect [OPTIONS]

	Options:
	  --model, -m TEXT      Local model path (.pth/.pt) when using --provider torchvision.
	  --data, -d TEXT       Dataset name (e.g., cifar10, cifar10_for_imagenet).
	  --defense, -D TEXT    Defense: mmbd, strip, aeva, freeeagle.
	  --provider, -p TEXT   Model provider: torchvision or huggingface.
	  --hf-model-id TEXT    Hugging Face model ID when --provider huggingface is used.
	  --freeeagle-num-classes INTEGER
	                        FreeEagle override for number of classes. Use 0 to auto-infer from model head. [default: 0]
	  --freeeagle-num-dummy INTEGER
	                        FreeEagle number of dummy optimization vectors. [default: 1]
	  --freeeagle-num-important-neurons INTEGER
	                        FreeEagle top neurons used when computing tendency. [default: 5]
	  --freeeagle-metric TEXT
	                        FreeEagle anomaly metric (e.g. 'softmax_score'). [default: softmax_score]
	  --freeeagle-use-transpose-correction
	                        Enable transpose correction inside FreeEagle.
	  --freeeagle-bound-on / --freeeagle-no-bound-on
	                        Enable or disable bounded optimization in FreeEagle. [default: freeeagle-bound-on]
	  --freeeagle-optimize-steps INTEGER
	                        FreeEagle optimization steps. [default: 300]
	  --freeeagle-learning-rate FLOAT
	                        FreeEagle optimization learning rate. [default: 0.01]
	  --freeeagle-weight-decay FLOAT
	                        FreeEagle optimization weight decay. [default: 0.005]
	  --freeeagle-anomaly-threshold FLOAT
	                        Threshold for FreeEagle anomaly_metric verdict. [default: 2.0]
	  --freeeagle-inspect-layer-position INTEGER
	                        ResNet stage index inspected by FreeEagle (0..4). [default: 2]
	  --out, -o TEXT        The output path for the JSON report. Use "-" for stdout or a file path (e.g. "reports/report.json"). [default: reports/report.json]
	  --force, -f           This allows overwriting. E.g. if the output file already exists --force will overwrite it.
	  --help                Show this message and exit.
	```
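When running several defenses against the same model, it can help to build the command lines programmatically. The sketch below uses only flags that appear in the help output above. The helper `build_detect_command` and the `reports/<defense>.json` naming are assumptions for illustration, and whether every defense supports a given model and dataset pairing is not verified here. The commands are printed rather than executed.

```python
import shlex

# Defense names as listed under --defense in the help output.
DEFENSES = ("mmbd", "strip", "aeva", "freeeagle")


def build_detect_command(defense, data="cifar10", model=None,
                         hf_model_id=None, force=False):
    """Build an argument list for `mithridatium detect`.

    Exactly one of `model` (local checkpoint) or `hf_model_id`
    (Hugging Face model ID) must be given; the provider follows from it.
    """
    if (model is None) == (hf_model_id is None):
        raise ValueError("Pass exactly one of model or hf_model_id")
    cmd = ["mithridatium", "detect", "--defense", defense, "--data", data]
    if model is not None:
        cmd += ["--provider", "torchvision", "--model", model]
    else:
        cmd += ["--provider", "huggingface", "--hf-model-id", hf_model_id]
    cmd += ["--out", f"reports/{defense}.json"]  # hypothetical naming scheme
    if force:
        cmd.append("--force")
    return cmd


commands = [
    build_detect_command(name, model="models/resnet18_poison.pth")
    for name in DEFENSES
]
for cmd in commands:
    # To execute instead of print: subprocess.run(cmd, check=True)
    # (requires `import subprocess` and an installed mithridatium CLI).
    print(shlex.join(cmd))
```

Passing an argument list rather than a single string avoids shell-quoting problems if a model path contains spaces.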