# TDDBench: A Benchmark for Training data detection

This code is the official implementation of TDDBench.

## Note ⭐⭐⭐

We have uploaded the datasets and target models used by TDDBench to [Huggingface](https://huggingface.co/TDDBench) to facilitate quick evaluation of training data detection (TDD) algorithms. The release currently includes 12 datasets and 60 target models, and we plan to upload more datasets and target models in the future.

To load an evaluation dataset, you can use the following code:

```python
# Load an evaluation dataset from the TDDBench Hugging Face hub
from datasets import load_dataset

dataset_name = "student"
dataset_path = f"TDDBench/{dataset_name}"
dataset = load_dataset(dataset_path)["train"]
```

To load a target model, you can use the following code:

```python
import numpy as np
from transformers import AutoConfig, AutoModel

from hfmodel import MLPConfig, MLPHFModel, WRNConfig, WRNHFModel

# Register MLPConfig and MLPHFModel so that AutoModel can load our model architecture automatically.
AutoConfig.register("mlp", MLPConfig)
AutoModel.register(MLPConfig, MLPHFModel)

# Load a target model
dataset_name = "student"  # Training dataset name
model_name = "mlp"  # Target model architecture
model_idx = 0  # To reduce statistical error, we train five target models for each architecture and training dataset.
model_path = f"TDDBench/{model_name}-{dataset_name}-{model_idx}"
model = AutoModel.from_pretrained(model_path)

# Load the training data detection labels: 1 marks the model's training data, 0 marks non-training data.
config = AutoConfig.from_pretrained(model_path)
tdd_label = np.array(config.tdd_label)
```

The [demo.ipynb](https://github.com/zzh9568/TDDBench/blob/main/demo.ipynb) notebook in our [release code](https://github.com/zzh9568/TDDBench) repository offers a straightforward example of downloading a target model and dataset from Hugging Face, and of recording the model's output loss on both training and non-training data.
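
As a rough sketch of what the notebook records: per-example loss is a common membership signal, since a model tends to have lower loss on its training data. The snippet below splits per-example cross-entropy losses by the TDD label; the fixed linear "model" and random data here are stand-in assumptions for illustration, not the actual TDDBench target model or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a loaded target model: a fixed linear classifier (illustrative only).
W = rng.normal(size=(8, 2))

# Stand-in features, class labels, and TDD membership labels.
X = rng.normal(size=(10, 8))
y = rng.integers(0, 2, size=10)
tdd_label = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

# Per-example cross-entropy loss of the classifier on each record.
logits = X @ W
logits -= logits.max(axis=1, keepdims=True)  # numerical stability
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
losses = -np.log(probs[np.arange(len(y)), y])

member_loss = losses[tdd_label == 1]     # losses on training data
nonmember_loss = losses[tdd_label == 0]  # losses on non-training data
print(member_loss.mean(), nonmember_loss.mean())
```

With a real target model, a gap between the two mean losses is exactly the signal that loss-based TDD methods threshold on.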

### References

```bibtex
@article{zhu2024tddbench,
  title={TDDBench: A Benchmark for Training data detection},
  author={Zhu, Zhihao and Yang, Yi and Lian, Defu},
  journal={arXiv preprint arXiv:2411.03363},
  year={2024}
}
```