Reproducibility Code

This folder contains the Python scripts needed to reproduce the watermark performance results shown in the leaderboard.

Scripts Overview

Dataset Preparation

  • C4_dataset_download.py: Downloads and prepares the C4 dataset for watermark evaluation
  • CNN_dataset_download.py: Downloads and prepares the CNN/DailyMail dataset for evaluation

Model Training & Inference

  • Finetune_sum.py: Fine-tunes language models for watermark evaluation
  • Inference_sum.py: Performs inference with watermarked models to generate test data

Evaluation Metrics

  • BERT_score.py: Computes BERTScore for text quality evaluation
  • Entity_similarity_score.py: Calculates entity similarity scores for watermark detection
  • Attack_dipper.py: Runs the DIPPER paraphrasing attack, a watermark removal attack used for robustness testing

Usage Instructions

  1. Environment Setup: Install the required dependencies (PyTorch, transformers, datasets, and any other packages imported by the individual scripts)

  2. Dataset Preparation: Run the dataset download scripts first

    python C4_dataset_download.py
    python CNN_dataset_download.py
    
  3. Model Training: Fine-tune your models

    python Finetune_sum.py
    
  4. Inference: Generate watermarked text

    python Inference_sum.py
    
  5. Evaluation: Run the evaluation metrics

    python BERT_score.py
    python Entity_similarity_score.py
    python Attack_dipper.py
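
The steps above can also be chained from a single driver script. The following is a minimal sketch, not part of this folder: the script names come from the steps above, while the stage grouping, the `build_commands` and `run_pipeline` helpers, and the `--run` flag are hypothetical names introduced for illustration. It assumes every script runs without command-line arguments, as shown above.

```python
import subprocess
import sys

# Script names are taken from the usage steps; the stage labels and the
# helper functions below are illustrative assumptions.
PIPELINE = [
    ("Dataset preparation", ["C4_dataset_download.py", "CNN_dataset_download.py"]),
    ("Model training", ["Finetune_sum.py"]),
    ("Inference", ["Inference_sum.py"]),
    ("Evaluation", ["BERT_score.py", "Entity_similarity_score.py", "Attack_dipper.py"]),
]


def build_commands(python=sys.executable):
    """Return one command per script, in the order the steps list them."""
    return [[python, script] for _, scripts in PIPELINE for script in scripts]


def run_pipeline(dry_run=True):
    """Print each command; execute it as well unless dry_run is True."""
    for command in build_commands():
        print("Running:", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later stages never run on top of a failed earlier stage.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default; pass the (hypothetical) --run flag to execute.
    run_pipeline(dry_run="--run" not in sys.argv)
```

Stopping at the first failure matters here because each stage presumably consumes the output of the previous one.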
    

Requirements

  • Python 3.8+
  • PyTorch
  • Transformers library
  • Datasets library
  • Other dependencies as specified in each script
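
The requirements listed above can be checked before running anything. This is a minimal sketch using only the standard library. The module list assumes the usual import names (`torch` for PyTorch, `transformers`, `datasets`); the helper names `missing_modules` and `python_ok` are hypothetical. It does not cover the per-script dependencies, which have to be read from each script's imports.

```python
import importlib.util
import sys

# Import names for the packages listed in Requirements (assumed:
# PyTorch is imported as "torch").
REQUIRED_MODULES = ["torch", "transformers", "datasets"]
MIN_PYTHON = (3, 8)


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level modules that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def python_ok(version=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8 or newer is required")
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found")
```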

Notes

  • Modify the configuration parameters in each script according to your setup
  • Ensure you have sufficient computational resources for training and evaluation
  • Results may vary based on random seeds and hardware differences
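
To reduce the seed-related variation mentioned above, the random number generators can be seeded at the start of each script. The sketch below is an assumption about how this might be done, not a description of what the scripts in this folder do. The `set_seed` helper and the default seed of 42 are illustrative, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed=42):
    """Seed Python's RNG, plus NumPy and PyTorch when they are available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


if __name__ == "__main__":
    set_seed(42)
    first = random.random()
    set_seed(42)
    second = random.random()
    print("Reproducible draw:", first == second)
```

Seeding does not remove hardware-related differences: some GPU kernels are non-deterministic, so small deviations between runs on different machines may remain.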

For detailed instructions on each evaluation metric, refer to the main guidelines in the leaderboard application.