# Reproducibility Codes
This folder contains the Python scripts needed to reproduce the watermark performance results shown in the leaderboard.
## Scripts Overview
### Dataset Preparation
- `C4_dataset_download.py`: Downloads and prepares the C4 dataset for watermark evaluation
- `CNN_dataset_download.py`: Downloads and prepares the CNN/DailyMail dataset for evaluation
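The download scripts themselves are not reproduced here. As a rough sketch of the kind of preparation involved (the function name, field names, and thresholds below are illustrative, not taken from the scripts), one typically filters raw corpus records down to prompts long enough for watermark evaluation; with Hugging Face `datasets`, the corpus itself would usually be fetched via `load_dataset("allenai/c4", "en", streaming=True)`:

```python
# Illustrative sketch only -- the actual C4_dataset_download.py may differ.
# Keeps C4-style records whose text is long enough to serve as an
# evaluation prompt, up to a fixed sample budget.

def prepare_c4_samples(records, min_words=50, max_samples=500):
    """Filter records to those with at least `min_words` words of text."""
    prepared = []
    for rec in records:
        text = rec.get("text", "")
        if len(text.split()) >= min_words:
            prepared.append({"text": text})
        if len(prepared) >= max_samples:
            break
    return prepared
```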
### Model Training & Inference
- `Finetune_sum.py`: Fine-tunes language models for watermark evaluation
- `Inference_sum.py`: Performs inference with watermarked models to generate test data
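`Inference_sum.py` presumably generates watermarked text. While the exact scheme used by the leaderboard is not shown in this README, a minimal sketch of the widely used "green-list" logit-bias idea (hash the previous token to pick a green subset of the vocabulary, then boost those logits by `delta`) looks like the following, with toy logits in place of a real model; all names here are hypothetical:

```python
import random

def greenlist(prev_token_id, vocab_size, gamma=0.5, key=42):
    """Seed an RNG from the previous token and a secret key, then select
    a fraction `gamma` of the vocabulary as 'green' tokens."""
    rng = random.Random(key * 1_000_003 + prev_token_id)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])

def bias_logits(logits, prev_token_id, delta=2.0, gamma=0.5, key=42):
    """Soft watermark: add `delta` to the logits of green tokens only."""
    green = greenlist(prev_token_id, len(logits), gamma, key)
    return [x + delta if i in green else x for i, x in enumerate(logits)]
```

A detector with the same key can recompute the green lists and test whether generated text over-uses green tokens.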
### Evaluation Metrics
- `BERT_score.py`: Computes BERT scores for text quality evaluation
- `Entity_similarity_score.py`: Calculates entity similarity scores for watermark detection
- `Attack_dipper.py`: Implements watermark removal attacks for robustness testing
## Usage Instructions
1. Environment Setup: Ensure you have the required dependencies installed (transformers, datasets, etc.)
2. Dataset Preparation: Run the dataset download scripts first:

       python C4_dataset_download.py
       python CNN_dataset_download.py

3. Model Training: Fine-tune your models:

       python Finetune_sum.py

4. Inference: Generate watermarked text:

       python Inference_sum.py

5. Evaluation: Run the evaluation metrics:

       python BERT_score.py
       python Entity_similarity_score.py
       python Attack_dipper.py
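The steps above can also be chained in a small driver script. The sketch below is hypothetical (not part of this repository); the `dry_run` flag lets you preview the commands without executing anything:

```python
# Hypothetical driver for the reproduction pipeline described above.
# Script names are taken from this README.
import subprocess
import sys

PIPELINE = [
    "C4_dataset_download.py",
    "CNN_dataset_download.py",
    "Finetune_sum.py",
    "Inference_sum.py",
    "BERT_score.py",
    "Entity_similarity_score.py",
    "Attack_dipper.py",
]

def run_pipeline(dry_run=False):
    """Build (and optionally run) one command per pipeline script, in order."""
    commands = [[sys.executable, script] for script in PIPELINE]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop on the first failure
    return commands
```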
## Requirements
- Python 3.8+
- PyTorch
- Transformers library
- Datasets library
- Other dependencies as specified in each script
## Notes
- Modify the configuration parameters in each script according to your setup
- Ensure you have sufficient computational resources for training and evaluation
- Results may vary based on random seeds and hardware differences
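Since results depend on random seeds, fixing them at the top of each script helps reproducibility. A minimal sketch (the helper name is illustrative; for PyTorch runs you would additionally call `torch.manual_seed(seed)` and `torch.cuda.manual_seed_all(seed)`):

```python
import random

import numpy as np

def set_seed(seed=0):
    """Fix the Python and NumPy RNG seeds so repeated runs draw the
    same random numbers. Extend with torch.manual_seed(seed) when
    PyTorch is in use."""
    random.seed(seed)
    np.random.seed(seed)
```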
For detailed instructions on each metric evaluation, refer to the main guidelines in the leaderboard application.