kirudang committed on commit 40b3335 · 1 Parent(s): bc4e081

Copy files from original watermark leaderboard
.gitignore ADDED
@@ -0,0 +1,23 @@
+ # See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
+
+ # dependencies
+ /node_modules
+ /.pnp
+ .pnp.js
+
+ # testing
+ /coverage
+
+ # production
+ /build
+
+ # misc
+ .DS_Store
+ .env.local
+ .env.development.local
+ .env.test.local
+ .env.production.local
+
+ npm-debug.log*
+ yarn-debug.log*
+ yarn-error.log*
CHANGELOG.md ADDED
@@ -0,0 +1,97 @@
+ # 📋 Changelog - Watermark Leaderboard Fixes
+
+ ## Version 2.0 - December 2024
+
+ ### 🐛 Bug Fixes
+
+ #### 1. Fixed Submission Validation
+ **Problem**: Users could only submit Attack-free data. Watermark Removal and Stealing Attack submissions failed validation.
+
+ **Solution**:
+ - Updated validation logic to accept any combination of attack types
+ - Users can now submit:
+   - Only Attack-free data
+   - Only Watermark Removal data
+   - Only Stealing Attack data
+   - Any combination of the above
+
+ **Code Changes**:
+ ```python
+ # Before: required the Attack-free fields
+ if normalized_utility is None or detection_rate is None:
+     return "❌ Error: Normalized Utility and Detection Rate are required"
+
+ # After: flexible validation
+ has_attack_free_data = normalized_utility is not None and detection_rate is not None
+ has_removal_data = absolute_utility_degradation is not None and removal_detection_rate is not None
+ has_stealing_data = adversary_bert_score is not None and adversary_detection_rate is not None
+
+ if not has_attack_free_data and not has_removal_data and not has_stealing_data:
+     return "❌ Error: Please provide at least one complete set of metrics"
+ ```
+
+ #### 2. Enhanced Pending Submissions Display
+ **Problem**: The pending submissions table only showed basic fields (ID, Name, Model, Normalized Utility, Detection Rate, Submitted At).
+
+ **Solution**:
+ - Updated the table to show ALL submission fields
+ - Administrators can now see complete submission details for proper review
+
+ **New Fields Displayed**:
+ - ID, Name, Model, Paper Link
+ - Attack-free Utility, Attack-free Detection
+ - Removal Degradation, Removal Detection
+ - Adversary BERT, Adversary Detection
+ - Submitted At
+
+ ### ✨ New Features
+
+ #### 1. Paper Link Field
+ - Added an optional paper link field to submissions
+ - Links are displayed in the pending submissions table
+
+ #### 2. Enhanced User Guidance
+ - Added clear submission requirements to the form
+ - Better error messages with specific guidance
+ - Visual indicators for required vs. optional fields
+
+ #### 3. Improved Form Labels
+ - Changed "Attack-free (Required)" to "Attack-free (Optional - Both Required if One is Provided)"
+ - Made it clear that all attack types are optional but that pairs must be complete
+
+ ### 🔧 Technical Improvements
+
+ #### 1. Better Validation Logic
+ - Separate validation for each attack type
+ - Clear error messages for each validation failure
+ - Consistent range validation across all metrics
+
+ #### 2. Enhanced Data Structure
+ - Improved pending submissions data formatting
+ - Better handling of optional fields
+ - Consistent data types across all metrics
+
+ #### 3. Updated Dependencies
+ - Added a numpy requirement for better data handling
+ - Updated Gradio version compatibility
+
+ ### 📊 User Experience Improvements
+
+ #### 1. Clearer Instructions
+ - Added a submission requirements box to the form
+ - Better placeholder text and help information
+ - Consistent styling across all form sections
+
+ #### 2. Better Error Handling
+ - More specific error messages
+ - Guidance on how to fix validation errors
+ - Consistent error formatting
+
+ #### 3. Enhanced Admin Experience
+ - Complete field visibility in pending submissions
+ - Better table formatting with all metrics
+ - Improved approval workflow
+
+ ### 🚀 Deployment Ready
+
+ All changes are compatible with Hugging Face Spaces and ready for immediate deployment. The fixes maintain backward compatibility while significantly improving functionality and user experience.
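The "both required if one is provided" pairing rule described in the changelog can be sketched in isolation. `validate_metric_pair` is a hypothetical helper written for illustration, not the Space's actual code:

```python
def validate_metric_pair(first, second, pair_name):
    """Return an error string if exactly one metric of a pair is provided.

    Hypothetical helper illustrating the "both required if one is provided"
    rule from the changelog; not the leaderboard's actual implementation.
    """
    if (first is None) != (second is None):
        return f"❌ Error: {pair_name} requires both metrics or neither"
    return None

# Example submissions: (metric_a, metric_b, pair_name)
pairs = [
    (0.92, 98.5, "Attack-free"),        # complete pair: accepted
    (None, None, "Watermark Removal"),  # omitted entirely: accepted
    (0.85, None, "Stealing Attack"),    # half-complete pair: rejected
]
errors = [e for a, b, name in pairs if (e := validate_metric_pair(a, b, name))]
```

Only the half-complete pair produces an error; a submission omitting an attack type entirely passes, matching the flexible validation above.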
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,78 @@
+ # 🚀 Deployment Guide for Hugging Face Spaces
+
+ ## Quick Fix for Your Existing Space
+
+ Your Hugging Face Space at https://huggingface.co/spaces/kirudang/watermark-leaderboard has been updated with the following fixes:
+
+ ### ✅ Fixed Issues
+
+ 1. **Flexible Submission Validation**
+    - Now accepts any combination of attack types
+    - Submit only Attack-free, only Watermark Removal, only Stealing Attack, or any combination
+    - Clear validation messages guide users
+
+ 2. **Complete Pending Submissions Table**
+    - Shows ALL fields: ID, Name, Model, Paper Link, Attack-free metrics, Watermark Removal metrics, Stealing Attack metrics, Submitted At
+    - Administrators can see complete submission details for proper review
+
+ 3. **Enhanced User Experience**
+    - Clear submission requirements displayed in the form
+    - Better error messages
+    - Paper link field added
+
+ ## 📁 Files to Update in Your Hugging Face Space
+
+ Upload these updated files to your Space:
+
+ 1. **app.py** - Main application with all fixes
+ 2. **requirements.txt** - Updated dependencies
+ 3. **README.md** - Updated documentation
+ 4. **leaderboard.json** - Latest data (if needed)
+
+ ## 🔄 How to Deploy
+
+ ### Option 1: Git Push (Recommended)
+ ```bash
+ # In your watermark-leaderboard directory
+ git add .
+ git commit -m "Fix submission validation and pending approval display"
+ git push origin main
+ ```
+
+ ### Option 2: Manual Upload
+ 1. Go to your Space: https://huggingface.co/spaces/kirudang/watermark-leaderboard
+ 2. Click the "Files and versions" tab
+ 3. Upload the updated files:
+    - `app.py`
+    - `requirements.txt`
+    - `README.md`
+    - `leaderboard.json` (if you want to update the data)
+
+ ## 🎯 What's Fixed
+
+ ### Before (Issues):
+ - ❌ Could only submit Attack-free data
+ - ❌ Pending submissions showed limited fields
+ - ❌ Confusing validation messages
+
+ ### After (Fixed):
+ - ✅ Can submit any combination: Attack-free, Watermark Removal, Stealing Attack
+ - ✅ Pending submissions show ALL fields for complete review
+ - ✅ Clear submission requirements and validation
+
+ ## 🔍 Testing the Fixes
+
+ After deployment, test these scenarios:
+
+ 1. **Submit only Stealing Attack data** (no Attack-free)
+ 2. **Submit only Watermark Removal data** (no Attack-free)
+ 3. **Submit a combination of all three types**
+ 4. **Check that the pending submissions table shows all fields**
+
+ ## 🛠️ Admin Controls
+
+ - **Admin Password**: `admin123` (you can change this in app.py)
+ - **Pending Submissions**: Shows complete details for review
+ - **Approval Process**: Approve/reject with full visibility
+
+ Your Space will automatically rebuild when you push the changes!
Guideline to submit your watermark performance.docx ADDED
Binary file (17.7 kB).
README.md CHANGED
@@ -1,13 +1,57 @@
 ---
- title: WatermarkLeaderboard
+ title: Watermark Leaderboard
 emoji: 🏆
- colorFrom: gray
- colorTo: yellow
+ colorFrom: blue
+ colorTo: green
 sdk: gradio
- sdk_version: 5.49.1
+ sdk_version: "4.44.0"
 app_file: app.py
 pinned: false
- short_description: 'Public Watermark Leaderboard '
+ license: mit
+ short_description: Interactive leaderboard for watermark performance evaluation
 ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Watermark Leaderboard 🏆
+
+ An interactive leaderboard for comparing watermark performance across different models and evaluation settings.
+
+ ## Features
+
+ - **Interactive Scatter Plot**: Visualize watermark performance with Plotly charts
+ - **Performance Table**: Detailed metrics with sorting and filtering
+ - **Multiple Evaluation Settings**: Attack-free, Watermark Removal, and Stealing Attack
+ - **Model Support**: LLaMA3 and DeepSeek models
+ - **Dynamic Filtering**: Real-time updates based on model and metric selection
+ - **Flexible Submissions**: Submit data for any combination of attack types
+ - **Pending Approval System**: All submissions are reviewed before appearing on the leaderboard
+ - **Complete Field Visibility**: Administrators see all submission details for review
+ - **Professional UI**: Clean, modern interface with accordion sections
+ - **Reproducibility**: Access to all evaluation code and guidelines
+
+ ## How to Use
+
+ 1. **Select Model**: Choose between LLaMA3 and DeepSeek
+ 2. **Choose Setting**: Pick from Attack-free, Watermark Removal, or Stealing Attack
+ 3. **View Results**: Explore the scatter plot and detailed table
+ 4. **Submit Data**: Click "Add Your Data" to submit new results
+    - Submit any combination of attack types (Attack-free, Watermark Removal, Stealing Attack)
+    - All submissions go through an approval process before appearing on the leaderboard
+ 5. **Administrator Review**: Administrators can review pending submissions with full field visibility
+
+ ## Metrics Explained
+
+ - **Normalized Utility ↑**: Higher values indicate better text quality
+ - **Detection Rate (%) ↑**: Higher values indicate better watermark detection
+ - **Absolute Utility Degradation ↑**: Higher values indicate better resistance to removal attacks
+ - **Adversary BERT Score ↑**: Higher values indicate better performance under adversarial conditions
+
+ ## Contributing
+
+ We encourage researchers to contribute their evaluation results. Please follow the guidelines in the "Guidelines" section for submission requirements.
+
+ ## License
+
+ MIT License
+
+ ---
+ *Last updated: September 2024*
Reproducibility/Attack_dipper.py ADDED
@@ -0,0 +1,135 @@
+ import os
+ # Set the HF_HOME environment variable to point to the desired cache location
+ os.environ["HF_TOKEN"] = "Your_HuggingFace_Token_Here"
+ # Specify the cache directory path
+ cache_dir = 'Your_Desired_Cache_Directory_Here'
+ os.environ['HF_HOME'] = cache_dir
+
+ import argparse
+ import json
+ import time
+
+ import nltk
+ import torch
+ import tqdm
+ from nltk.tokenize import sent_tokenize
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ # nltk.download("punkt")
+
+
+ def main(args):
+     # Clear the CUDA cache
+     torch.cuda.empty_cache()
+     # Load data from the specified JSON file
+     with open(args.data, 'r') as f:
+         data = json.load(f)
+     data = [{"query": item["input"], "output_with_watermark": item[args.column_in_data]} for item in data[4960:args.Ninputs]]
+
+     # Load the model and tokenizer
+     time1 = time.time()
+     tokenizer = T5Tokenizer.from_pretrained("google/t5-v1_1-xxl")
+     model = T5ForConditionalGeneration.from_pretrained(args.model_name)
+     print("Model loaded in", time.time() - time1)
+     model.cuda()
+     model.eval()
+
+     # Initialize a list for the attacked outputs and a counter over the inputs
+     attack_results = []
+     input_counter = 0
+
+     # Iterate over the data
+     for idx, dd in tqdm.tqdm(enumerate(data), total=len(data)):
+         print(f"Processing input {idx + 1} / {len(data)}")
+         input_gen = dd["output_with_watermark"].strip() if isinstance(dd["output_with_watermark"], str) else dd["output_with_watermark"][0].strip()
+
+         # Initialize dipper_inputs and w_wm_output_attacked as empty lists
+         dipper_inputs = []
+         w_wm_output_attacked = []
+
+         assert args.lex in [0, 20, 40, 60, 80, 100], "Lexical diversity must be one of 0, 20, 40, 60, 80, 100."
+         assert args.order in [0, 20, 40, 60, 80, 100], "Order diversity must be one of 0, 20, 40, 60, 80, 100."
+         # Calculate the control codes for the paraphrase model
+         lex_code = int(100 - args.lex)
+         order_code = int(100 - args.order)
+
+         # Remove spurious newlines and extra whitespace
+         input_gen = " ".join(input_gen.split())
+         # Split the input into sentences
+         sentences = sent_tokenize(input_gen)
+         # Whitespace removal on the prompt
+         prefix = " ".join(dd["query"].replace("\n", " ").split())
+         output_text = ""
+         final_input_text = ""
+
+         # Generate the paraphrase for each sentence window
+         for sent_idx in range(0, len(sentences), args.sent_interval):
+             curr_sent_window = " ".join(sentences[sent_idx : sent_idx + args.sent_interval])
+             if args.no_ctx:
+                 final_input_text = f"lexical = {lex_code}, order = {order_code} <sent> {curr_sent_window} </sent>"
+             else:
+                 final_input_text = f"lexical = {lex_code}, order = {order_code} {prefix} <sent> {curr_sent_window} </sent>"
+
+             final_input = tokenizer([final_input_text], return_tensors="pt")
+             final_input = {k: v.cuda() for k, v in final_input.items()}
+
+             # Generate the paraphrase
+             with torch.inference_mode():
+                 outputs = model.generate(
+                     **final_input,
+                     do_sample=True,
+                     top_p=0.75,
+                     top_k=None,
+                     max_length=400
+                 )
+             outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
+             prefix += " " + outputs[0]
+             output_text += " " + outputs[0]
+
+         # Store the attacked output and the input for the paraphrase model
+         w_wm_output_attacked.append(output_text.strip())
+         dipper_inputs.append(final_input_text)
+
+         # Create a dictionary with the specified columns
+         result = {
+             "original_query": dd["query"],
+             "watermarked_response": dd["output_with_watermark"],
+             # "final_input_text": dipper_inputs,
+             "paraphrased_response": w_wm_output_attacked[0]
+         }
+         # Add the result to the list of results
+         attack_results.append(result)
+
+         # Increment the input counter
+         input_counter += 1
+
+         # Save the results after processing every saving_freq inputs
+         if input_counter % args.saving_freq == 0:
+             # Remove the previous checkpoint file if it exists
+             if os.path.isfile(f"{args.output_name}_{input_counter-args.saving_freq}.json"):
+                 os.remove(f"{args.output_name}_{input_counter-args.saving_freq}.json")
+
+             with open(f"{args.output_name}_{input_counter}.json", "w") as json_file:
+                 json.dump(attack_results, json_file, indent=4)
+
+
+ if __name__ == "__main__":
+     parser = argparse.ArgumentParser(description="Attack by Dipper Paraphrasing")
+     parser.add_argument("--data", type=str, default="Llama3_SIR_test_13860.json", help="The data to be attacked / paraphrased.")
+     parser.add_argument("--column_in_data", type=str, default="output_only", help="Column in the data to be attacked / paraphrased.")
+     parser.add_argument("--output_name", type=str, default="Dipper_Llama3_SIR_13860_4960_", help="The output file prefix for the attacked / paraphrased data.")
+     parser.add_argument("--Ninputs", type=int, default=13860, help="Number of inputs to be attacked / paraphrased.")
+     parser.add_argument("--saving_freq", type=int, default=10, help="The frequency of saving the output.")
+     parser.add_argument("--model_name", type=str, default="kalpeshk2011/dipper-paraphraser-xxl", help="The model name to use.")
+     parser.add_argument("--no_ctx", type=bool, default=True, help="Whether to drop the context. Note: argparse parses any non-empty string as True here.")
+     parser.add_argument("--sent_interval", type=int, default=3, help="The sentence interval.")
+     parser.add_argument("--lex", type=int, default=60, help="Lexical diversity knob for the paraphrase attack.")
+     parser.add_argument("--order", type=int, default=60, help="Order diversity knob for the paraphrase attack.")
+
+     args = parser.parse_args()
+     main(args)
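`Attack_dipper.py` paraphrases the watermarked text `sent_interval` sentences at a time, appending each paraphrase to a growing prefix. The chunking step can be illustrated in isolation on toy sentences (no model or tokenizer needed):

```python
# Sentence-window chunking as done in Attack_dipper.py, on made-up data.
sentences = ["One.", "Two.", "Three.", "Four.", "Five."]
sent_interval = 3  # same default as the script's --sent_interval

# Each window of up to sent_interval sentences becomes one paraphrase request.
windows = [
    " ".join(sentences[i : i + sent_interval])
    for i in range(0, len(sentences), sent_interval)
]
```

With five sentences and an interval of 3, this yields two windows; the script then feeds each window to the Dipper model with the accumulated prefix as context.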
Reproducibility/BERT_score.py ADDED
@@ -0,0 +1,71 @@
+ import os
+ import json
+ import argparse
+ import csv
+ import warnings
+
+ import torch
+ from bert_score import score
+
+ warnings.filterwarnings("ignore")
+
+ # Ensure the HF_HOME environment variable points to your desired cache location
+ os.environ["HF_TOKEN"] = "Your_HF_TOKEN"
+ cache_dir = 'Your_cache_directory'
+ os.environ['HF_HOME'] = cache_dir
+
+
+ def main(args):
+     start = 0
+     # Clear the CUDA cache
+     torch.cuda.empty_cache()
+
+     # Load the candidate and reference files
+     with open(args.data_can, 'r') as f:
+         data_1 = json.load(f)[start:args.N]
+     cands = [item["Watermarked_summary"] for item in data_1]
+     # randomized_words = [item["Total_randomized_words"] for item in data_1]
+     # total_words = [item["Total_words"] for item in data_1]
+
+     with open(args.data_ref, 'r') as f:
+         data_2 = json.load(f)[start:args.N]
+     refs = [item["summary"] for item in data_2]
+
+     # Set the saving frequency
+     saving_freq = 10
+     # Initialize the input counter
+     input_counter = 0
+     results = []
+     # Loop through the data and calculate the BERTScore
+     for i, item in enumerate(cands):
+         num_tokens = len(item.split())
+         print(f"Item number: {i}")
+
+         if num_tokens >= 16:  # Only consider items with at least 16 tokens for a valid assessment
+             P, R, F1 = score([cands[i]], [refs[i]], lang="en", verbose=True)
+             scores = F1.mean().item()
+             # results.append([i, scores, randomized_words[i], total_words[i]])
+             results.append([i, scores])
+         else:
+             print(f"Skipping item number {i} due to insufficient tokens.")
+
+         # Increment the input counter
+         input_counter += 1
+
+         # Write the results to a CSV file after processing every saving_freq inputs
+         if input_counter % saving_freq == 0:
+             # Remove the previous checkpoint file if it exists
+             if os.path.isfile(f"{args.Output_name}{start}_{input_counter-saving_freq}.csv"):
+                 os.remove(f"{args.Output_name}{start}_{input_counter-saving_freq}.csv")
+
+             with open(f'{args.Output_name}{start}_{input_counter}.csv', 'w', newline='') as f:
+                 writer = csv.writer(f)
+                 writer.writerow(["data_item", "BERTScore"])
+                 writer.writerows(results)
+
+
+ if __name__ == '__main__':
+     parser = argparse.ArgumentParser(description='Calculate BERTScore')
+     parser.add_argument('--data_can', default='DeepSeek_TW_Summarization_test__1000.json', type=str, help='a file containing the candidate documents to test')
+     parser.add_argument('--data_ref', default='DeepSeek_No_WM_Summarization_test_0_1000_1000.json', type=str, help='a file containing the reference documents to test')
+     parser.add_argument('--N', default=1000, type=int, help='Number of data items to process')
+     parser.add_argument('--Output_name', default="BERTScore_DeepSeek_Summarization_TW_ref_No_WM_", type=str, help='Name of the output file')
+     main(parser.parse_args())
Reproducibility/C4_dataset_download.py ADDED
@@ -0,0 +1,20 @@
+ import os
+ # Ensure the HF_HOME environment variable points to your desired cache location
+ os.environ["HF_TOKEN"] = "Your_HF_Token"
+ cache_dir = 'Your_Cache_Dir'
+ os.environ['HF_HOME'] = cache_dir
+
+ from datasets import load_dataset
+
+ # Specify a local directory for caching
+ dataset_path = "./c4_realnewslike"
+
+ # Load only the "realnewslike" subset (train and validation splits)
+ dataset = load_dataset("allenai/c4", "realnewslike", cache_dir=dataset_path)
+
+ # Print confirmation
+ print("Dataset downloaded and stored at:", dataset_path)
+
+ # Print the number of samples in each split
+ print("Number of training samples:", len(dataset["train"]))
+ print("Number of validation samples:", len(dataset["validation"]))
Reproducibility/CNN_dataset_download.py ADDED
@@ -0,0 +1,38 @@
+ import os
+ # Ensure the HF_HOME environment variable points to your desired cache location
+ os.environ["HF_TOKEN"] = "Your_HF_Token"
+ cache_dir = 'Your_Cache_Dir'
+ os.environ['HF_HOME'] = cache_dir
+
+ import json
+ from datasets import load_dataset
+
+ # Set the dataset save path
+ save_path = "cnn.json"
+ if not os.path.exists(save_path):
+     dataset = load_dataset("abisee/cnn_dailymail", "3.0.0")
+     train_data = dataset["train"][:20000]
+     test_data = dataset["test"][:1000]
+
+     data_subset = []
+
+     for article, highlights, data_id in zip(train_data["article"], train_data["highlights"], train_data["id"]):
+         data_subset.append({
+             "id": data_id,
+             "article": article,
+             "highlights": highlights,
+             "type": "train"
+         })
+
+     for article, highlights, data_id in zip(test_data["article"], test_data["highlights"], test_data["id"]):
+         data_subset.append({
+             "id": data_id,
+             "article": article,
+             "highlights": highlights,
+             "type": "test"
+         })
+
+     with open(save_path, "w", encoding="utf-8") as f:
+         json.dump(data_subset, f, ensure_ascii=False, indent=4)
+
+     print(f"Data saved to {save_path}")
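The script above flattens two columnar dataset splits into one JSON list, tagging each record with its split name. The reshaping step can be shown on toy columnar data (a dict of lists, which is what slicing a Hugging Face dataset returns):

```python
def tag_split(split, name):
    """Turn a columnar split (dict of lists) into a list of tagged records."""
    return [
        {"id": i, "article": a, "highlights": h, "type": name}
        for a, h, i in zip(split["article"], split["highlights"], split["id"])
    ]

# Toy splits standing in for the sliced CNN/DailyMail data.
train = {"id": ["t1"], "article": ["A"], "highlights": ["HA"]}
test = {"id": ["s1"], "article": ["B"], "highlights": ["HB"]}

data_subset = tag_split(train, "train") + tag_split(test, "test")
```

`tag_split` is a hypothetical helper used here for illustration; the script inlines the same two loops.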
Reproducibility/Entity_similarity_score.py ADDED
@@ -0,0 +1,138 @@
+ import json
+ import spacy
+ from sklearn.metrics.pairwise import cosine_similarity
+ from difflib import SequenceMatcher
+ import numpy as np
+ import pandas as pd
+ from tqdm import tqdm
+
+ # === Data ===
+ ref_data = "DeepSeek7b_No_WM_test_13860.json"
+ ref_column = "output_only"
+
+ cand_data = "Dipper_DeepSeek_TW_13860.json"
+ cand_column = "paraphrased_response"
+
+ output_name = "Entity_Dipper_DeepSeek_TW.csv"
+ N = 13860
+
+ # Load spaCy's Named Entity Recognition (NER) model
+ nlp = spacy.load("en_core_web_sm")
+
+ # Extract named entities from text
+ def extract_named_entities(text):
+     doc = nlp(text)
+     return [ent.text for ent in doc.ents]
+
+
+ # === Similarity Calculation ===
+ def compute_similarity(entity1, entity2):
+     if entity1 == entity2:
+         return 1.0, 1.0, 1.0
+
+     lev_similarity = SequenceMatcher(None, entity1, entity2).ratio()
+
+     vec1 = nlp(entity1).vector.reshape(1, -1)
+     vec2 = nlp(entity2).vector.reshape(1, -1)
+
+     if np.any(vec1) and np.any(vec2):
+         cos_similarity = cosine_similarity(vec1, vec2)[0][0]
+     else:
+         cos_similarity = 0.0
+
+     combined_similarity = (lev_similarity + cos_similarity) / 2
+     return combined_similarity, lev_similarity, cos_similarity
+
+
+ # === Greedy Pairwise Matching ===
+ def greedy_pairwise_matching(ref_entities, cand_entities):
+     matched_entities = []
+     cand_entities_copy = cand_entities.copy()
+
+     for ref_entity in ref_entities:
+         best_match = None
+         best_score = 0
+         best_lev_similarity = 0
+         best_cos_similarity = 0
+
+         for cand_entity in cand_entities_copy:
+             similarity_score, lev_similarity, cos_similarity = compute_similarity(ref_entity, cand_entity)
+
+             if similarity_score > best_score:
+                 best_score = similarity_score
+                 best_match = cand_entity
+                 best_lev_similarity = lev_similarity
+                 best_cos_similarity = cos_similarity
+
+         if best_match:
+             matched_entities.append((ref_entity, best_match, best_score, best_lev_similarity, best_cos_similarity))
+             cand_entities_copy.remove(best_match)
+         else:
+             matched_entities.append((ref_entity, "MISSING", 0, 0, 0))
+
+     for cand_entity in cand_entities_copy:
+         matched_entities.append(("NEW ENTITY", cand_entity, 0, 0, 0))
+
+     return matched_entities
+
+
+ # === Load Data ===
+ with open(ref_data, "r", encoding="utf-8") as ref_file:
+     reference_data = [entry[ref_column] for entry in json.load(ref_file)[:N]]
+
+ with open(cand_data, "r", encoding="utf-8") as cand_file:
+     candidate_data = [entry[cand_column] for entry in json.load(cand_file)[:N]]
+
+ assert len(reference_data) == len(candidate_data), "Mismatch in data point count!"
+
+
+ # === Process Each Pair ===
+ results = []
+
+ for idx, (ref_text, cand_text) in enumerate(tqdm(zip(reference_data, candidate_data), total=len(reference_data))):
+     ref_entities = extract_named_entities(ref_text)
+     cand_entities = extract_named_entities(cand_text)
+
+     if len(ref_entities) == 0 and len(cand_entities) == 0:
+         continue
+
+     matched_entities = greedy_pairwise_matching(ref_entities, cand_entities)
+
+     # Compute similarity lists
+     cosine_similarities = [match[4] for match in matched_entities if match[4] > 0]
+     levenshtein_similarities = [match[3] for match in matched_entities if match[3] > 0]
+
+     avg_cosine_similarity = np.mean(cosine_similarities) if cosine_similarities else 0.0
+     avg_levenshtein_similarity = np.mean(levenshtein_similarities) if levenshtein_similarities else 0.0
+     avg_similarity = (avg_cosine_similarity + avg_levenshtein_similarity) / 2
+
+     # Count the exact matches
+     exact_match_pairs = sum(1 for match in matched_entities if match[0] == match[1])
+
+     # Union count
+     union_count = len(ref_entities) + len(cand_entities) - exact_match_pairs
+     union_count = union_count if union_count > 0 else 1  # Avoid division by zero
+
+     # Final score
+     final_score = (avg_similarity / union_count) * max(len(ref_entities), len(cand_entities))
+
+     results.append({
+         "Index": idx,
+         "Reference_Entity_Count": len(ref_entities),
+         "Candidate_Entity_Count": len(cand_entities),
+         "Reference_Entities": ref_entities,
+         "Candidate_Entities": cand_entities,
+         "Matched_Entities": matched_entities,
+         "Exact_Match_Pairs": exact_match_pairs,
+         "Union_Count": union_count,
+         "Average_Cosine_Similarity": avg_cosine_similarity,
+         "Average_Levenshtein_Similarity": avg_levenshtein_similarity,
+         "Average_Combined_Similarity_Score": avg_similarity,
+         "Final_Score": final_score
+     })
+
+ # === Save to CSV ===
+ results_df = pd.DataFrame(results)
+ results_df.to_csv(output_name, index=False)
+
+ print(f'Average Final Score: {results_df["Final_Score"].mean()}')
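The `Final_Score` in this script scales the average pair similarity by a Jaccard-style union count. A worked example with assumed entity lists and an assumed average similarity makes the arithmetic concrete (the values below are illustrative, not taken from the evaluation data):

```python
# Worked example of the Final_Score formula from Entity_similarity_score.py.
ref_entities = ["Paris", "France", "EU"]
cand_entities = ["Paris", "Europe"]
exact_match_pairs = 1   # only "Paris" matches exactly
avg_similarity = 0.6    # assumed average combined (Levenshtein + cosine) similarity

# Union of entities, counting exact matches once: 3 + 2 - 1 = 4
union_count = len(ref_entities) + len(cand_entities) - exact_match_pairs

# Scale by the larger entity count: (0.6 / 4) * 3 = 0.45
final_score = (avg_similarity / union_count) * max(len(ref_entities), len(cand_entities))
```

The division by `union_count` penalizes texts whose entity sets diverge, while the `max(...)` factor keeps the score comparable across texts with different numbers of entities.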
Reproducibility/Finetune_sum.py ADDED
@@ -0,0 +1,298 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ # Set the HF_HOME environment variable to point to the desired cache location
3
+ # os.environ["HF_TOKEN"] = "your_hugging_face_token_here" # Replace with your Hugging Face token
4
+ # Specify the directory path
5
+ cache_dir = '/network/rit/lab/Lai_ReSecureAI/kiel/wmm'
6
+ # Set the HF_HOME environment variable
7
+ os.environ['HF_HOME'] = cache_dir
8
+
9
+ os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
10
+
11
+ import matplotlib.pyplot as plt
12
+ import logging
13
+ import time
14
+ import torch
15
+ import json
16
+ import torch.nn as nn
17
+ from typing import Optional
18
+ import pandas as pd
19
+ from datasets import Dataset
20
+ from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training
21
+ from dataclasses import dataclass, field
22
+ from transformers import (
23
+ HfArgumentParser,
24
+ AutoTokenizer,
25
+ TrainingArguments,
26
+ BitsAndBytesConfig,
27
+ TrainerCallback,
28
+ AutoModelForCausalLM
29
+ )
30
+ from trl import SFTTrainer
31
+ import warnings
32
+
33
+
34
+ # Ignore all warnings
35
+ warnings.filterwarnings("ignore")
36
+
37
+ # Set up logging
38
+ logging.basicConfig(level=logging.INFO)
39
+ logger = logging.getLogger(__name__)
40
+
41
+ # Clear cache
42
+ torch.cuda.empty_cache()
43
+ device = "cuda" if torch.cuda.is_available() else "cpu"
44
+
45
+ # Initialize parameters
46
+ model_name = "Llama2" #'DeepSeek' #'Llama3' #
47
+ WM = "TW"
48
+ num_data = 10000
49
+ num_epochs = 5
50
+ learning_rate_ = 1e-5
51
+
52
+
53
+ # Print parameters
54
+ print(f'Device: {device}')
55
+ print(f'Model: {model_name}')
56
+ print(f'WM: {WM}')
57
+ print(f'Number of data: {num_data}')
58
+ print(f'Number of epochs: {num_epochs}')
59
+ print(f'Learning rate: {learning_rate_}')
60
+
61
+ start_time = time.time()
62
+ # Load data
63
+ def load_data(file_path, num_data):
64
+ with open(file_path, 'r') as f:
65
+ data = json.load(f)
66
+ return [
67
+ {
68
+ "text": "Now summarize the following text with maximum 60 words: " +
69
+ item["article"] +
70
+ "\nThe summary is: " +
71
+ item['Watermarked_summary']
72
+ }
73
+ for item in data[:num_data]
74
+ ]
75
+
76
+
77
+ # Create dataset
78
+ def create_dataset(data):
79
+ """
80
+ Convert the concatenated data into a Hugging Face Dataset format.
81
+ """
82
+ df = pd.DataFrame(data) # Each element in 'data' is a dictionary with 'text' as the key
83
+ return Dataset.from_pandas(df)
84
+
85
+ def get_file_paths(model_name,WM):
86
+ base_path = '/network/rit/lab/Lai_ReSecureAI/kiel/Website/Stealing/'
87
+ if WM == "SafeSeal":
88
+ paths = {
89
+ 'DeepSeek': ('DeepSeek_train_Summarization_Safeseal_top_3_threshold_0.8_Uniform_0_20000_20k.json', 'DeepSeek_test_Summarization_Safeseal_top_3_threshold_0.8_Uniform_0_1000_1000.json'),
90
+ 'Llama3': ('Llama3_train_Summarization_Safeseal_top_3_threshold_0.8_Uniform_0_20000_20k.json', 'Llama3_test_Summarization_Safeseal_top_3_threshold_0.8_Uniform_0_1000_1000.json')
91
+ }
92
+ elif WM == "DTM":
93
+ paths = {
94
+ 'Llama3': ('Llama3_DTM_Summarization_train__20000.json', 'Llama3_DTM_Summarization_test__1000.json'),
95
+ 'DeepSeek': ('DeepSeek_DTM_Summarization_train__20000.json', 'DeepSeek_DTM_Summarization_test__1000.json'),
96
+ 'Llama2': ('Llama2_DTM_Summarization_train_20k.json', 'Llama2_DTM_Summary_test_1000.json'),
97
+ 'Mistral': ('Mistral_DTM_Summarization_train_20k.json', 'Mistral_DTM_Summary_test_1000.json')
98
+ }
99
+ elif WM == "KGW":
100
+ paths = {
101
+ 'Llama3': ('Llama3_KGW_Summarization_train_0_20000_20000.json', 'Llama3_KGW_Summarization_test_0_1000_1000.json'),
102
+ 'DeepSeek': ('DeepSeek_KGW_Summarization_train_0_20000_20000.json', 'DeepSeek_KGW_Summarization_test_0_1000_1000.json')
103
+ }
104
+ elif WM == "SIR":
105
+ paths = {
106
+ 'DeepSeek': ('DeepSeek_SIR_Summarization_train_0_20000_20000.json', 'DeepSeek_SIR_Summarization_test_0_1000_1000.json'),
107
+ 'Llama3': ('Llama3_SIR_Summarization_train_0_20000_20000.json', 'Llama3_SIR_Summarization_test_0_1000_1000.json')
108
+ }
109
+ elif WM == "SynthID":
110
+ paths = {
111
+ 'DeepSeek': ('DeepSeek_SynthID_Summarization_train_0_20000_20000.json', 'DeepSeek_SynthID_Summarization_test_0_1000_1000.json'),
112
+ 'Llama3': ('Llama3_SynthID_Summarization_train_0_20000_20000.json', 'Llama3_SynthID_Summarization_test_0_1000_1000.json')
113
+ }
114
+ elif WM == "TW":
115
+ paths = {
116
+ 'DeepSeek': ('DeepSeek_TW_Summarization_train_20000.json', 'DeepSeek_TW_Summarization_test__1000.json'),
117
+ 'Llama3': ('Llama3_TW_Summarization_train__20000.json', 'Llama3_TW_Summarization_test__1000.json'),
118
+ 'Llama2': ('Llama2_TW_Summarization_train_20k.json', 'Llama2_TW_Summary_test_1000.json'),
119
+ 'Mistral': ('Mistral_TW_Summarization_train_20k.json', 'Mistral_TW_Summary_Test_1000.json')
120
+ }
121
+ else:
+ raise ValueError(f"Unknown watermark scheme: {WM}")
+
122
+ return base_path + paths[model_name][0], base_path + paths[model_name][1]
123
+
124
+ def get_new_model_path(model_name,WM, num_epochs, learning_rate_, num_data):
125
+ #return f"/network/rit/lab/Lai_ReSecureAI/phung/adversary_models/{model_name}_epoch{num_epochs}_lr{learning_rate_}_K{K}_Threshold{Threshold}_data{num_data}_testing_batch{batch_no}_"
126
+ return f"./adversary_models/{model_name}_{WM}_epoch{num_epochs}_lr{learning_rate_}_data{num_data}_"
127
+ #return f"/network/rit/lab/Lai_ReSecureAI/phung/adversary_models/{model_name}_{WM}_epoch{num_epochs}_lr{learning_rate_}_data{num_data}_"
128
+
129
+ train_file, test_file = get_file_paths(model_name, WM)
130
+ train_data = load_data(train_file, num_data)
131
+ test_data = load_data(test_file, num_data)
132
+
133
+ train_dataset = create_dataset(train_data)
134
+ test_dataset = create_dataset(test_data)
135
+
136
+ new_model = get_new_model_path(model_name, WM, num_epochs, learning_rate_, num_data)
137
+ print(f'New model path: {new_model}')
138
+
139
+ # Load parameters
140
+ @dataclass
141
+ class ScriptArguments:
142
+ use_8_bit: Optional[bool] = field(default=False, metadata={"help": "use 8 bit precision"})
143
+ use_4_bit: Optional[bool] = field(default=False, metadata={"help": "use 4 bit precision"})
144
+ bnb_4bit_quant_type: Optional[str] = field(default="nf4", metadata={"help": "specify the quantization type (fp4 or nf4)"})
145
+ use_bnb_nested_quant: Optional[bool] = field(default=False, metadata={"help": "use nested quantization"})
146
+ use_multi_gpu: Optional[bool] = field(default=True, metadata={"help": "use multi GPU"})
147
+ use_adapters: Optional[bool] = field(default=True, metadata={"help": "use adapters"})
148
+ batch_size: Optional[int] = field(default=8, metadata={"help": "input batch size"})
149
+ max_seq_length: Optional[int] = field(default=400, metadata={"help": "max sequence length"})
150
+ optimizer_name: Optional[str] = field(default="adamw_hf", metadata={"help": "Optimizer name"})
151
+
152
+ parser = HfArgumentParser(ScriptArguments)
153
+ script_args = parser.parse_args_into_dataclasses()[0]
154
+
155
+ # Device map
156
+ device_map = "auto" if script_args.use_multi_gpu else "cpu"
157
+
158
+ # Check precision settings
159
+ if script_args.use_8_bit and script_args.use_4_bit:
160
+ raise ValueError("You can't use 8 bit and 4 bit precision at the same time")
161
+
162
+ bnb_config = BitsAndBytesConfig(
163
+ load_in_4bit=True,
164
+ bnb_4bit_compute_dtype=torch.float16,
165
+ bnb_4bit_quant_type=script_args.bnb_4bit_quant_type,
166
+ bnb_4bit_use_double_quant=script_args.use_bnb_nested_quant,
167
+ ) if script_args.use_4_bit else None
168
+
169
+ # Load model and tokenizer
170
+ model = AutoModelForCausalLM.from_pretrained(
171
+ "meta-llama/Meta-Llama-3-8B" if model_name == 'Llama3'
172
+ else "meta-llama/Llama-2-7b-chat-hf" if model_name == 'Llama2'
173
+ else "mistralai/Mistral-7B-Instruct-v0.2" if model_name == 'Mistral'
174
+ else "deepseek-ai/deepseek-llm-7b-base",
175
+ cache_dir=cache_dir,
176
+ quantization_config=bnb_config,
177
+ device_map={"": 0}
178
+ )
179
+
180
+ model.config.use_cache = False
181
+ model.config.pretraining_tp = 1
182
+ model = prepare_model_for_kbit_training(model)
183
+
184
+ tokenizer = AutoTokenizer.from_pretrained(
185
+ "meta-llama/Meta-Llama-3-8B" if model_name == 'Llama3'
186
+ else "meta-llama/Llama-2-7b-chat-hf" if model_name == 'Llama2'
187
+ else "mistralai/Mistral-7B-Instruct-v0.2" if model_name == 'Mistral'
188
+ else "deepseek-ai/deepseek-llm-7b-base",
189
+ use_fast=False
190
+ )
191
+
192
+ tokenizer.add_special_tokens({'pad_token': '[PAD]'})
193
+ tokenizer.pad_token = tokenizer.eos_token
194
+
195
+ # LoRA Config
196
+ peft_config = LoraConfig(
197
+ lora_alpha=32, # LoRA scaling factor; adapter updates are scaled by alpha / r
198
+ lora_dropout=0.05,
199
+ r=16, # Rank of the LoRA decomposition
200
+ target_modules= ['q_proj','k_proj','v_proj','o_proj','gate_proj','down_proj','up_proj','lm_head'],
201
+ bias="none",
202
+ task_type="CAUSAL_LM",
203
+ )
204
+
205
+ # Create adapter directory
206
+ os.makedirs(new_model, exist_ok=True)
207
+
208
+ # Store loss for visualization
209
+ class LoggingCallback(TrainerCallback):
210
+ def on_log(self, args, state, control, logs=None, **kwargs):
211
+ if logs:
212
+ output_log_file = os.path.join(args.output_dir, "train_results.json")
213
+ with open(output_log_file, "a") as writer:
214
+ writer.write(json.dumps(logs) + "\n")
215
+
216
+ # Training arguments
217
+ training_arguments = TrainingArguments(
218
+ num_train_epochs=num_epochs,
219
+ evaluation_strategy="steps",
220
+ save_steps=-1,
221
+ save_total_limit=1,
222
+ logging_steps=500,
223
+ eval_steps=500,
224
+ learning_rate=learning_rate_,
225
+ weight_decay=0.001,
226
+ per_device_train_batch_size=script_args.batch_size,
227
+ max_steps=-1,
228
+ gradient_accumulation_steps=4,
229
+ per_device_eval_batch_size=script_args.batch_size,
230
+ output_dir=new_model,
231
+ max_grad_norm=0.3,
232
+ warmup_ratio=0.03,
233
+ lr_scheduler_type="constant",
234
+ optim=script_args.optimizer_name,
235
+ fp16=True,
236
+ logging_strategy="steps",
237
+ log_level='info'
238
+ )
239
+
240
+ trainer = SFTTrainer(
241
+ model=model,
242
+ tokenizer=tokenizer,
243
+ train_dataset=train_dataset,
244
+ eval_dataset=test_dataset,
245
+ dataset_text_field="text",
246
+ peft_config=peft_config,
247
+ max_seq_length=script_args.max_seq_length,
248
+ args=training_arguments,
249
+ callbacks=[LoggingCallback()]
250
+ )
251
+
252
+ trainer.train()
253
+ trainer.model.save_pretrained(new_model)
254
+ trainer.tokenizer.save_pretrained(new_model)
255
+ print('Done in ', time.time() - start_time)
256
+
257
+ # Save plots
258
+ epochs, train_losses, eval_losses = [], [], []
259
+
260
+ # Load evaluation results
261
+ eval_results_file = os.path.join(new_model, "train_results.json")
262
+ with open(eval_results_file, "r") as f:
263
+ for line in f:
264
+ data = json.loads(line)
265
+ if 'epoch' in data:
266
+ epoch = data['epoch']
267
+ if 'loss' in data:
268
+ train_losses.append(data['loss'])
269
+ epochs.append(epoch)
270
+ if 'eval_loss' in data:
271
+ eval_losses.append(data['eval_loss'])
272
+ if epoch not in epochs:
273
+ epochs.append(epoch)
274
+
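The loop above reads the trainer's JSON-lines log and splits it into train/eval loss series. A standalone sketch of the same parsing logic (the synthetic log records here are illustrative, not real trainer output):

```python
import json
import os
import tempfile

def parse_trainer_logs(path):
    """Parse a JSON-lines trainer log into epoch, train-loss, and eval-loss series."""
    epochs, train_losses, eval_losses = [], [], []
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            epoch = rec.get("epoch")
            if "loss" in rec:
                train_losses.append(rec["loss"])
                epochs.append(epoch)
            if "eval_loss" in rec:
                eval_losses.append(rec["eval_loss"])
                if epoch not in epochs:
                    epochs.append(epoch)
    return epochs, train_losses, eval_losses

# Tiny demo with a synthetic two-record log
with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "train_results.json")
    with open(log, "w") as f:
        f.write(json.dumps({"epoch": 1.0, "loss": 0.9}) + "\n")
        f.write(json.dumps({"epoch": 1.0, "eval_loss": 0.8}) + "\n")
    series = parse_trainer_logs(log)
print(series)
```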
275
+ # Plotting
276
+ plt.figure(figsize=(10, 5))
277
+ plt.plot(epochs[:len(train_losses)], train_losses, label='Train Loss', color='blue')
278
+ plt.plot(epochs[:len(eval_losses)], eval_losses, label='Eval Loss', color='red')
279
+ plt.xlabel('Epoch')
280
+ plt.ylabel('Loss')
281
+ plt.title('Training and Evaluation Loss', fontsize=10)
282
+ plt.legend()
283
+ plt.tight_layout()
284
+
285
+ # Save the plot
286
+ plot_path = os.path.join(new_model, 'training_evaluation_loss_plot.png')
287
+ plt.savefig(plot_path)
288
+ plt.close()
289
+
290
+ print(f"Plot saved to '{plot_path}'.")
291
+
292
+
293
+
294
+
295
+
296
+
297
+
298
+
Reproducibility/Inference_sum.py ADDED
@@ -0,0 +1,154 @@
1
+ import os
2
+ # Ensure the HF_HOME environment variable points to your desired cache location
3
+ # os.environ["HF_TOKEN"] = "your_hugging_face_token_here" # Replace with your Hugging Face token
4
+ cache_dir = '/network/rit/lab/Lai_ReSecureAI/kiel/wmm'
5
+ os.environ['HF_HOME'] = cache_dir
6
+
7
+ import time
8
+ import torch
9
+ from transformers import AutoModelForCausalLM, AutoTokenizer
10
+ from peft import PeftModel
11
+ import json
12
+
13
+ # Clear cache
14
+ torch.cuda.empty_cache()
15
+ device = "cuda" if torch.cuda.is_available() else "cpu"
16
+
17
+ # Initialize parameters
18
+ model_name = "DeepSeek" #"Llama3" #'DeepSeek' #
19
+ WM = "SafeSeal"
20
+ num_data = 20000
21
+ num_epochs = 5
22
+ learning_rate_ = 1e-5
23
+ N = 1000
24
+
25
+ # Print parameters
27
+ print(f'Device: {device}')
28
+ print(f'Model: {model_name}')
29
+ print(f'WM: {WM}')
30
+ print(f'Number of data: {num_data}')
31
+ print(f'Number of epochs: {num_epochs}')
32
+ print(f'Learning rate: {learning_rate_}')
33
+ print(f'Number of generated data: {N}')
34
+
35
+ # Base model
36
+ if model_name == 'Llama3':
37
+ LLM_name = "meta-llama/Meta-Llama-3-8B"
38
+ base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B",
39
+ low_cpu_mem_usage=True,
40
+ return_dict=True,
41
+ torch_dtype=torch.float16,
42
+ device_map={"": 0})
43
+ elif model_name == 'DeepSeek':
44
+ LLM_name = "deepseek-ai/deepseek-llm-7b-base"
45
+ base_model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-llm-7b-base",
46
+ low_cpu_mem_usage=True,
47
+ return_dict=True,
48
+ torch_dtype=torch.float16,
49
+ device_map={"": 0})
50
+
51
+ # Adapter path
52
+ def get_adapter_path(model_name,WM, num_epochs, learning_rate_, num_data):
53
+ return f"./adversary_models/{model_name}_{WM}_epoch{num_epochs}_lr{learning_rate_}_data{num_data}_"
54
+ #return f"/network/rit/lab/Lai_ReSecureAI/phung/adversary_models/{model_name}_epoch{num_epochs}_lr{learning_rate_}_K{K}_Threshold{Threshold}_data{num_data}_testing_batch{batch_no}_"
55
+
56
+ adapter = get_adapter_path(model_name,WM, num_epochs, learning_rate_, num_data)
57
+
58
+ # Check if the adapter path exists
59
+ print(f'Path to adapter: {adapter}')
60
+ if os.path.exists(adapter):
61
+ print("Path exists.")
62
+ else:
63
+ print("Path does not exist.")
64
+
65
+ # Merge the base model and adapter
66
+ model = PeftModel.from_pretrained(base_model, adapter)
67
+ print("Model loaded successfully.")
68
+ model = model.merge_and_unload()
69
+ print("Model merged successfully.")
70
+ model.to(device)
71
+
72
+ # Initialize the tokenizer and model
73
+ tokenizer = AutoTokenizer.from_pretrained(LLM_name, cache_dir=cache_dir,use_fast=False)
74
+ # # Add special tokens
75
+ # tokenizer.add_special_tokens({'pad_token': '[PAD]'})
76
+ # tokenizer.pad_token = tokenizer.eos_token
77
+
78
+ # Load the data
79
+ max_output_tokens=90
80
+ min_output_tokens=10
81
+ data_link = "/network/rit/lab/Lai_ReSecureAI/kiel/New_WM/Summarization/cnn.json"
82
+ output_results = []
83
+ input_counter = 0
84
+ saving_freq = 10
85
+ data = "test"
86
+ output_name = f"Adversary_{model_name}_{WM}_Summarization_{data}_{num_data}_{num_epochs}_{learning_rate_}_"
87
+
88
+ # torch.clear_cache()
89
+
90
+ def text_summarize(input_text, model, tokenizer, max_output_tokens , min_output_tokens):
91
+ #prompt = f"{input_text}\nThe summary is:"
92
+ prompt = f"""
93
+ Input: The CNN/Daily Mail dataset is one of the most widely used datasets for text summarization.
94
+ It contains news articles and their corresponding highlights, which act as summaries.
95
+ State-of-the-art models often use this dataset to fine-tune their summarization capabilities.
96
+
97
+ Example Summary: The CNN/Daily Mail dataset is commonly used for training summarization models with news articles and highlights.
98
+
99
+ Now summarize the following text with maximum 60 words: {input_text}
100
+ The summary is:"""
101
+ # prompt = f"""
102
+ # Now summarize the following text with maximum 60 words: {input_text}
103
+ # The summary is:"""
104
+ inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)
105
+ inputs_tokens = inputs['input_ids'].to(device)
106
+
107
+ output = model.generate(
108
+ inputs_tokens,
109
+ max_new_tokens=max_output_tokens,
110
+ min_new_tokens=min_output_tokens,
111
+ do_sample=True,
112
+ temperature=0.9,
113
+ top_k=50,
114
+ eos_token_id=tokenizer.eos_token_id,
115
+ pad_token_id=tokenizer.eos_token_id, # Use EOS as pad to avoid generation warnings
116
+ repetition_penalty=1.2 # Discourages repetitive output
117
+ )
118
+
119
+ summary = tokenizer.decode(output[0], skip_special_tokens=True)
120
+ return summary.split("The summary is:")[-1].strip()
121
+
122
+ with open(data_link, "r", encoding="utf-8") as f:
123
+ data_subset = json.load(f)
124
+
125
+ # Filter test data
126
+ test_data = [sample for sample in data_subset if sample["type"] == data]
127
+
128
+ # Testing loop
129
+ for i, sample in enumerate(test_data[:N]):
130
+ print(f"Processed {i+1}/{len(test_data[:N])}")
131
+ text = sample["article"]
132
+ summary = text_summarize(text, model, tokenizer, max_output_tokens , min_output_tokens)
133
+
134
+ # Store the input and output in a dictionary
135
+ data_dict = {
136
+ "id": sample["id"],
137
+ "article": sample["article"],
138
+ "highlights": sample["highlights"],
139
+ "summary": summary,
140
+ "type": data
141
+ }
142
+
143
+ output_results.append(data_dict)
144
+ input_counter += 1
145
+
146
+ # Save the results frequently
147
+ if input_counter % saving_freq == 0:
148
+ # Check if the file exists
149
+ if os.path.isfile(output_name + "_" + str(input_counter-saving_freq) + ".json"):
150
+ os.remove(output_name + "_" + str(input_counter-saving_freq) + ".json")
151
+ with open(output_name + "_" + str(input_counter) + ".json", "w", encoding="utf-8") as json_file:
152
+ json.dump(output_results, json_file, indent=4)
153
+
154
+ print(f"Summarization complete. Results saved with prefix '{output_name}'.")
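The testing loop above uses a rolling-save pattern: results are written out every `saving_freq` items and the previous partial file is deleted, so only the latest checkpoint remains on disk. A standalone sketch of that pattern (file names here are illustrative):

```python
import json
import os
import tempfile

def rolling_save(records, out_prefix, saving_freq=2):
    """Append records one by one; every `saving_freq` items, write a
    checkpoint file and delete the previous partial checkpoint."""
    results = []
    for i, rec in enumerate(records, start=1):
        results.append(rec)
        if i % saving_freq == 0:
            prev = f"{out_prefix}_{i - saving_freq}.json"
            if os.path.isfile(prev):
                os.remove(prev)
            with open(f"{out_prefix}_{i}.json", "w") as f:
                json.dump(results, f, indent=4)
    return results

with tempfile.TemporaryDirectory() as d:
    prefix = os.path.join(d, "Adversary_demo")
    rolling_save([{"id": k} for k in range(6)], prefix, saving_freq=2)
    remaining = sorted(os.listdir(d))  # only the final checkpoint survives
print(remaining)
```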
Reproducibility/README.md ADDED
@@ -0,0 +1,61 @@
1
+ # Reproducibility Codes
2
+
3
+ This folder contains the Python scripts needed to reproduce the watermark performance results shown in the leaderboard.
4
+
5
+ ## Scripts Overview
6
+
7
+ ### Dataset Preparation
8
+ - **`C4_dataset_download.py`**: Downloads and prepares the C4 dataset for watermark evaluation
9
+ - **`CNN_dataset_download.py`**: Downloads and prepares the CNN/DailyMail dataset for evaluation
10
+
11
+ ### Model Training & Inference
12
+ - **`Finetune_sum.py`**: Fine-tunes language models for watermark evaluation
13
+ - **`Inference_sum.py`**: Performs inference with watermarked models to generate test data
14
+
15
+ ### Evaluation Metrics
16
+ - **`BERT_score.py`**: Computes BERT scores for text quality evaluation
17
+ - **`Entity_similarity_score.py`**: Calculates entity similarity scores for watermark detection
18
+ - **`Attack_dipper.py`**: Implements watermark removal attacks for robustness testing
19
+
20
+ ## Usage Instructions
21
+
22
+ 1. **Environment Setup**: Ensure you have the required dependencies installed (transformers, datasets, etc.)
23
+
24
+ 2. **Dataset Preparation**: Run the dataset download scripts first
25
+ ```bash
26
+ python C4_dataset_download.py
27
+ python CNN_dataset_download.py
28
+ ```
29
+
30
+ 3. **Model Training**: Fine-tune your models
31
+ ```bash
32
+ python Finetune_sum.py
33
+ ```
34
+
35
+ 4. **Inference**: Generate watermarked text
36
+ ```bash
37
+ python Inference_sum.py
38
+ ```
39
+
40
+ 5. **Evaluation**: Run the evaluation metrics
41
+ ```bash
42
+ python BERT_score.py
43
+ python Entity_similarity_score.py
44
+ python Attack_dipper.py
45
+ ```
46
+
47
+ ## Requirements
48
+
49
+ - Python 3.8+
50
+ - PyTorch
51
+ - Transformers library
52
+ - Datasets library
53
+ - Other dependencies as specified in each script
54
+
55
+ ## Notes
56
+
57
+ - Modify the configuration parameters in each script according to your setup
58
+ - Ensure you have sufficient computational resources for training and evaluation
59
+ - Results may vary based on random seeds and hardware differences
60
+
61
+ For detailed instructions on each metric evaluation, refer to the main guidelines in the leaderboard application.
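`Finetune_sum.py` builds each training example by concatenating an instruction, the source article, and the watermarked summary into a single `text` field. A minimal sketch of that record format (the sample dict here is a made-up placeholder):

```python
def build_records(samples, num_data):
    """Mirror the prompt construction in Finetune_sum.py: one 'text'
    field per record, instruction + article + watermarked summary."""
    return [
        {
            "text": "Now summarize the following text with maximum 60 words: "
            + item["article"]
            + "\nThe summary is: "
            + item["Watermarked_summary"]
        }
        for item in samples[:num_data]
    ]

demo = build_records(
    [{"article": "A short article.", "Watermarked_summary": "A summary."}], 1
)
print(demo[0]["text"])
```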
app.py ADDED
@@ -0,0 +1,1097 @@
1
+ import gradio as gr
2
+ import json
3
+ import os
4
+ import pandas as pd
5
+ import plotly.express as px
6
+ import plotly.graph_objects as go
7
+ from datetime import datetime
8
+ from plotly.subplots import make_subplots
9
+
10
+ # Load leaderboard data
11
+ def load_leaderboard_data():
12
+ try:
13
+ with open('leaderboard.json', 'r') as f:
14
+ return json.load(f)
15
+ except (FileNotFoundError, json.JSONDecodeError):
16
+ return []
17
+
18
+ # Filter data based on model and metric
19
+ def filter_data(data, model, metric):
20
+ filtered = []
21
+ for item in data:
22
+ if item.get('model') == model:
23
+ if metric == "Attack-free":
24
+ if item.get('normalizedUtility') is not None and item.get('detectionRate') is not None:
25
+ filtered.append({
26
+ 'name': item.get('name', ''),
27
+ 'model': item.get('model', ''),
28
+ 'normalizedUtility': item.get('normalizedUtility', 0),
29
+ 'detectionRate': item.get('detectionRate', 0)
30
+ })
31
+ elif metric == "Watermark Removal":
32
+ if (item.get('absoluteUtilityDegregation') is not None and
33
+ item.get('removal_detectionRate') is not None):
34
+ filtered.append({
35
+ 'name': item.get('name', ''),
36
+ 'model': item.get('model', ''),
37
+ 'absoluteUtilityDegregation': item.get('absoluteUtilityDegregation', 0),
38
+ 'removal_detectionRate': item.get('removal_detectionRate', 0)
39
+ })
40
+ elif metric == "Stealing Attack":
41
+ if (item.get('adversaryBERTscore') is not None and
42
+ item.get('adversaryDetectionRate') is not None):
43
+ filtered.append({
44
+ 'name': item.get('name', ''),
45
+ 'model': item.get('model', ''),
46
+ 'adversaryBERTscore': item.get('adversaryBERTscore', 0),
47
+ 'adversaryDetectionRate': item.get('adversaryDetectionRate', 0)
48
+ })
49
+
50
+ # Sort by detection rate (descending)
51
+ if metric == "Attack-free":
52
+ filtered.sort(key=lambda x: x['detectionRate'], reverse=True)
53
+ elif metric == "Watermark Removal":
54
+ filtered.sort(key=lambda x: x['removal_detectionRate'], reverse=True)
55
+ else: # Stealing Attack
56
+ filtered.sort(key=lambda x: x['adversaryDetectionRate'], reverse=True)
57
+
58
+ return filtered
59
+
60
+ # Create scatter plot
61
+ def create_scatter_plot(data, metric):
62
+ if not data:
63
+ return go.Figure()
64
+
65
+ # Prepare data for plotting
66
+ x_data = []
67
+ y_data = []
68
+ names = []
69
+
70
+ for item in data:
71
+ names.append(item['name'])
72
+ if metric == "Attack-free":
73
+ x_data.append(item['normalizedUtility'])
74
+ y_data.append(item['detectionRate'])
75
+ elif metric == "Watermark Removal":
76
+ x_data.append(item['absoluteUtilityDegregation'])
77
+ y_data.append(item['removal_detectionRate'])
78
+ else: # Stealing Attack
79
+ x_data.append(item['adversaryBERTscore'])
80
+ y_data.append(item['adversaryDetectionRate'])
81
+
82
+ # Create scatter plot
83
+ fig = go.Figure()
84
+
85
+ # Add scatter points
86
+ fig.add_trace(go.Scatter(
87
+ x=x_data,
88
+ y=y_data,
89
+ mode='markers+text',
90
+ marker=dict(
91
+ size=12,
92
+ color='#3B82F6',
93
+ line=dict(width=2, color='white')
94
+ ),
95
+ text=names,
96
+ textposition='top center',
97
+ textfont=dict(size=10, color='#374151'),
98
+ hovertemplate='<b>%{text}</b><br>' +
99
+ ('Normalized Utility: %{x:.3f}<br>' if metric == "Attack-free" else
100
+ 'Abs Utility Degradation: %{x:.3f}<br>' if metric == "Watermark Removal" else
101
+ 'Adversary BERT Score: %{x:.3f}<br>') +
102
+ ('Detection Rate: %{y:.3f}%<br>' if metric != "Stealing Attack" else
103
+ 'Adversary Detection Rate: %{y:.3f}%<br>') +
104
+ '<extra></extra>'
105
+ ))
106
+
107
+ # Set axis labels
108
+ if metric == "Attack-free":
109
+ x_title = "Normalized Utility"
110
+ y_title = "Detection Rate (%)"
111
+ elif metric == "Watermark Removal":
112
+ x_title = "Absolute Utility Degradation"
113
+ y_title = "Removal Detection Rate (%)"
114
+ else: # Stealing Attack
115
+ x_title = "Adversary BERT Score"
116
+ y_title = "Adversary Detection Rate (%)"
117
+
118
+ fig.update_layout(
119
+ title=f"{metric} Performance Scatter Plot",
120
+ xaxis_title=x_title,
121
+ yaxis_title=y_title,
122
+ font=dict(size=12, color='#374151'),
123
+ plot_bgcolor='white',
124
+ paper_bgcolor='white',
125
+ xaxis=dict(
126
+ gridcolor='lightgray',
127
+ showgrid=True,
128
+ zeroline=False
129
+ ),
130
+ yaxis=dict(
131
+ gridcolor='lightgray',
132
+ showgrid=True,
133
+ zeroline=False
134
+ ),
135
+ margin=dict(l=60, r=60, t=80, b=60)
136
+ )
137
+
138
+ return fig
139
+
140
+ # Create table data with heatmap styling
141
+ def create_table_data(data, metric):
142
+ if not data:
143
+ return pd.DataFrame()
144
+
145
+ table_data = []
146
+ for i, item in enumerate(data, 1):
147
+ row = {'Rank': i, 'Watermark': item['name']}
148
+
149
+ if metric == "Attack-free":
150
+ row['Normalized Utility ↑'] = f"{item['normalizedUtility']:.3f}"
151
+ row['Detection Rate (%) ↑'] = f"{item['detectionRate']:.3f}"
152
+ elif metric == "Watermark Removal":
153
+ row['Abs Utility Degradation ↑'] = f"{item['absoluteUtilityDegregation']:.3f}"
154
+ row['Removal Detection Rate (%) ↑'] = f"{item['removal_detectionRate']:.3f}"
155
+ else: # Stealing Attack
156
+ row['Adversary BERT Score ↑'] = f"{item['adversaryBERTscore']:.3f}"
157
+ row['Adversary Detection Rate (%) ↑'] = f"{item['adversaryDetectionRate']:.3f}"
158
+
159
+ table_data.append(row)
160
+
161
+ return pd.DataFrame(table_data)
162
+
163
+ # Create table data with green arrows and reference links
164
+ def create_table_data(data, metric):
165
+ if not data:
166
+ return pd.DataFrame()
167
+
168
+ table_data = []
169
+ for i, item in enumerate(data, 1):
170
+ watermark_name = item['name']
171
+ paper_link = item.get('paperLink')
172
+ model = item.get('model', 'N/A')
173
+
174
+ # Create reference link if paper link exists (smaller text)
175
+ if paper_link:
176
+ reference_link = f'<a href="{paper_link}" target="_blank" style="color: #3B82F6; text-decoration: underline; font-size: 0.8em;">📄 Paper</a>'
177
+ else:
178
+ reference_link = '-'
179
+
180
+ row = {
181
+ 'Watermark': watermark_name
182
+ }
183
+
184
+ if metric == "Attack-free":
185
+ row['Normalized Utility ↑'] = f"{item['normalizedUtility']:.3f}"
186
+ row['Detection Rate (%) ↑'] = f"{item['detectionRate']:.3f}"
187
+ elif metric == "Watermark Removal":
188
+ row['Abs Utility Degradation ↑'] = f"{item['absoluteUtilityDegregation']:.3f}"
189
+ row['Removal Detection Rate (%) ↑'] = f"{item['removal_detectionRate']:.3f}"
190
+ else: # Stealing Attack
191
+ row['Adversary BERT Score ↑'] = f"{item['adversaryBERTscore']:.3f}"
192
+ row['Adversary Detection Rate (%) ↑'] = f"{item['adversaryDetectionRate']:.3f}"
193
+
194
+ # Add Reference column at the end
195
+ row['Reference'] = reference_link
196
+
197
+ table_data.append(row)
198
+
199
+ return pd.DataFrame(table_data)
200
+
201
+ # Update interface based on selections
202
+ def update_interface(model, metric):
203
+ data = load_leaderboard_data()
204
+ filtered_data = filter_data(data, model, metric)
205
+
206
+ # Create scatter plot
207
+ scatter_plot = create_scatter_plot(filtered_data, metric)
208
+
209
+ # Create table with green arrows
210
+ table_data = create_table_data(filtered_data, metric)
211
+
212
+ return scatter_plot, table_data
213
+
214
+ # Handle form submission
215
+ def submit_watermark_data(name, model, paper_link, normalized_utility, detection_rate,
216
+ absolute_utility_degradation, removal_detection_rate,
217
+ adversary_bert_score, adversary_detection_rate):
218
+ """Handle watermark data submission"""
219
+
220
+ # Validation
221
+ if not name or not name.strip():
222
+ return "❌ Error: Watermark name is required", gr.update()
223
+
224
+ if not model:
225
+ return "❌ Error: Model selection is required", gr.update()
226
+
227
+ # Validate paper link if provided
228
+ if paper_link and paper_link.strip():
229
+ paper_link = paper_link.strip()
230
+ if not (paper_link.startswith('http://') or paper_link.startswith('https://')):
231
+ return "❌ Error: Paper link must start with http:// or https://", gr.update()
232
+ else:
233
+ paper_link = None
234
+
235
+ # Check what type of submission this is based on provided fields
236
+ has_attack_free_data = normalized_utility is not None and detection_rate is not None
237
+ has_removal_data = absolute_utility_degradation is not None and removal_detection_rate is not None
238
+ has_stealing_data = adversary_bert_score is not None and adversary_detection_rate is not None
239
+
240
+ # At least one complete set of metrics must be provided
241
+ if not has_attack_free_data and not has_removal_data and not has_stealing_data:
242
+ return "❌ Error: Please provide at least one complete set of metrics:\n• Attack-free: Normalized Utility + Detection Rate\n• Watermark Removal: Absolute Utility Degradation + Removal Detection Rate\n• Stealing Attack: Adversary BERT Score + Adversary Detection Rate", gr.update()
243
+
244
+ # Validate Attack-free metrics if provided
245
+ if has_attack_free_data:
246
+ if normalized_utility <= 0 or normalized_utility > 1.0:
247
+ return "❌ Error: Normalized Utility must be between 0.000 and 1.000", gr.update()
248
+ if detection_rate < 0.0 or detection_rate > 100.0:
249
+ return "❌ Error: Detection Rate must be between 0.000 and 100.000", gr.update()
250
+
251
+ # Validate Watermark Removal metrics if provided
252
+ if has_removal_data:
253
+ if absolute_utility_degradation <= 0 or absolute_utility_degradation > 1.0:
254
+ return "❌ Error: Absolute Utility Degradation must be between 0.000 and 1.000", gr.update()
255
+ if removal_detection_rate < 0.0 or removal_detection_rate > 100.0:
256
+ return "❌ Error: Removal Detection Rate must be between 0.000 and 100.000", gr.update()
257
+
258
+ # Validate Stealing Attack metrics if provided
259
+ if has_stealing_data:
260
+ if adversary_bert_score <= 0 or adversary_bert_score > 1.0:
261
+ return "❌ Error: Adversary BERT Score must be between 0.000 and 1.000", gr.update()
262
+ if adversary_detection_rate < 0.0 or adversary_detection_rate > 100.0:
263
+ return "❌ Error: Adversary Detection Rate must be between 0.000 and 100.000", gr.update()
264
+
265
+ # Validate partial adversary data (if one is provided, both are required)
266
+ has_partial_adversary = (adversary_bert_score is not None and adversary_bert_score > 0) or \
267
+ (adversary_detection_rate is not None and adversary_detection_rate > 0)
268
+
269
+ if has_partial_adversary and not has_stealing_data:
270
+ return "❌ Error: If you provide one adversary metric, you must provide both Adversary BERT Score and Adversary Detection Rate", gr.update()
271
+
272
+ # Create new entry - only include provided values, don't set missing ones to 0
273
+ new_entry = {
274
+ "name": name.strip(),
275
+ "model": model,
276
+ "normalizedUtility": normalized_utility,
277
+ "detectionRate": detection_rate
278
+ }
279
+
280
+ # Add paper link if provided
281
+ if paper_link:
282
+ new_entry["paperLink"] = paper_link
283
+
284
+ # Only add optional metrics if they were provided
285
+ if absolute_utility_degradation is not None:
286
+ new_entry["absoluteUtilityDegregation"] = absolute_utility_degradation
287
+ if removal_detection_rate is not None:
288
+ new_entry["removal_detectionRate"] = removal_detection_rate
289
+ if adversary_bert_score is not None:
290
+ new_entry["adversaryBERTscore"] = adversary_bert_score
291
+ if adversary_detection_rate is not None:
292
+ new_entry["adversaryDetectionRate"] = adversary_detection_rate
293
+
294
+ # Load existing approved data to check for duplicates
295
+ try:
296
+ with open('leaderboard.json', 'r') as f:
297
+ approved_data = json.load(f)
298
+ except (FileNotFoundError, json.JSONDecodeError):
299
+ approved_data = []
300
+
301
+ # Check for duplicate names in approved data
302
+ for entry in approved_data:
303
+ if entry.get('name') == name.strip() and entry.get('model') == model:
304
+ return f"❌ Error: A watermark named '{name.strip()}' already exists for {model}", gr.update()
305
+
306
+ # Load pending submissions to check for duplicates there too
307
+ try:
308
+ with open('pending_submissions.json', 'r') as f:
309
+ pending_data = json.load(f)
310
+ except (FileNotFoundError, json.JSONDecodeError):
311
+ pending_data = []
312
+
313
+ # Check for duplicate names in pending data
314
+ for entry in pending_data:
315
+ if entry.get('name') == name.strip() and entry.get('model') == model:
316
+ return f"❌ Error: A watermark named '{name.strip()}' is already pending approval for {model}", gr.update()
317
+
318
+ # Add submission timestamp and status
319
+ new_entry['submitted_at'] = datetime.now().isoformat()
320
+ new_entry['status'] = 'pending'
321
+ new_entry['submission_id'] = f"{name.strip()}_{model}_{int(datetime.now().timestamp())}"
322
+
323
+ # Add to pending submissions instead of approved data
324
+ pending_data.append(new_entry)
325
+
326
+ # Save pending submissions
327
+ try:
328
+ with open('pending_submissions.json', 'w') as f:
329
+ json.dump(pending_data, f, indent=2)
330
+
331
+ # Update the interface with current approved data only
332
+ filtered_data = filter_data(approved_data, model, "Attack-free")
333
+ scatter_plot = create_scatter_plot(filtered_data, "Attack-free")
334
+ table_data = create_table_data(filtered_data, "Attack-free")
335
+
336
+ success_msg = f"✅ Successfully submitted '{name.strip()}' ({model}) for approval! Your submission will be reviewed by the administrator before appearing on the leaderboard."
337
+ return success_msg, scatter_plot, table_data
338
+
339
+ except Exception as e:
340
+ return f"❌ Error saving submission: {str(e)}", gr.update(), gr.update()
341
+
342
+ # Clear form function
343
+ def clear_form():
344
+ return (None, None, "LLaMA3", None, None, None, None, None, None)
345
+
346
+ # Owner approval functions
347
+ def load_pending_submissions():
348
+ """Load pending submissions for owner review"""
349
+ try:
350
+ with open('pending_submissions.json', 'r') as f:
351
+ pending_data = json.load(f)
352
+
353
+ if not pending_data:
354
+ return pd.DataFrame(columns=["ID", "Name", "Model", "Paper Link", "Attack-free Utility", "Attack-free Detection",
355
+ "Removal Degradation", "Removal Detection", "Adversary BERT", "Adversary Detection", "Submitted At"])
356
+
357
+ # Format data for display with all fields
358
+ formatted_data = []
359
+ for entry in pending_data:
360
+ watermark_name = entry.get('name', 'N/A')
361
+ paper_link = entry.get('paperLink', '-')
362
+ model = entry.get('model', 'N/A')
363
+
364
+ # Format all metric fields
365
+ formatted_entry = {
366
+ "ID": entry.get('submission_id', 'N/A'),
367
+ "Name": watermark_name,
368
+ "Model": model,
369
+ "Paper Link": paper_link,
370
+ "Attack-free Utility": f"{entry.get('normalizedUtility', 0):.3f}" if entry.get('normalizedUtility') is not None else '-',
371
+ "Attack-free Detection": f"{entry.get('detectionRate', 0):.3f}" if entry.get('detectionRate') is not None else '-',
372
+ "Removal Degradation": f"{entry.get('absoluteUtilityDegregation', 0):.3f}" if entry.get('absoluteUtilityDegregation') is not None else '-',
373
+ "Removal Detection": f"{entry.get('removal_detectionRate', 0):.3f}" if entry.get('removal_detectionRate') is not None else '-',
374
+ "Adversary BERT": f"{entry.get('adversaryBERTscore', 0):.3f}" if entry.get('adversaryBERTscore') is not None else '-',
375
+ "Adversary Detection": f"{entry.get('adversaryDetectionRate', 0):.3f}" if entry.get('adversaryDetectionRate') is not None else '-',
376
+ "Submitted At": entry.get('submitted_at', 'N/A')[:19] if entry.get('submitted_at') else 'N/A', # Show only date and time
377
+ }
378
+ formatted_data.append(formatted_entry)
379
+
380
+ return pd.DataFrame(formatted_data)
381
+
382
+ except Exception as e:
383
+ print(f"Error loading pending submissions: {e}")
384
+ return pd.DataFrame(columns=["ID", "Name", "Model", "Paper Link", "Attack-free Utility", "Attack-free Detection",
385
+ "Removal Degradation", "Removal Detection", "Adversary BERT", "Adversary Detection", "Submitted At"])
386
+
387
+ def approve_submission(submission_id, admin_password):
388
+ """Approve a pending submission"""
389
+ # Check admin password
390
+ if admin_password != "admin123": # TODO: replace this hardcoded password before deployment
391
+ return "❌ Access denied: Invalid admin password", gr.update()
392
+
393
+ try:
394
+ # Load pending submissions from file (not from the formatted function)
395
+ try:
396
+ with open('pending_submissions.json', 'r') as f:
397
+ pending_data = json.load(f)
398
+ except (FileNotFoundError, json.JSONDecodeError):
399
+ pending_data = []
400
+
401
+ # Find and remove the submission
402
+ approved_entry = None
403
+ for i, entry in enumerate(pending_data):
404
+ if entry.get('submission_id') == submission_id:
405
+ approved_entry = pending_data.pop(i)
406
+ break
407
+
408
+ if not approved_entry:
409
+ return "❌ Submission not found", gr.update()
410
+
411
+ # Remove submission metadata
412
+ approved_entry.pop('submitted_at', None)
413
+ approved_entry.pop('status', None)
414
+ approved_entry.pop('submission_id', None)
415
+
416
+ # Load approved data
417
+ try:
418
+ with open('leaderboard.json', 'r') as f:
419
+ approved_data = json.load(f)
420
+ except (FileNotFoundError, json.JSONDecodeError):
421
+ approved_data = []
422
+
423
+ # Add to approved data
424
+ approved_data.append(approved_entry)
425
+
426
+ # Save approved data
427
+ with open('leaderboard.json', 'w') as f:
428
+ json.dump(approved_data, f, indent=2)
429
+
430
+ # Save updated pending data
431
+ with open('pending_submissions.json', 'w') as f:
432
+ json.dump(pending_data, f, indent=2)
433
+
434
+ return f"✅ Approved submission: {approved_entry.get('name', 'Unknown')}", load_pending_submissions()
435
+
436
+ except Exception as e:
437
+ return f"❌ Error approving submission: {str(e)}", gr.update()
438
+
439
+ def reject_submission(submission_id, admin_password):
440
+ """Reject a pending submission"""
441
+ # Check admin password
442
+ if admin_password != "admin123": # TODO: replace this hardcoded password before deployment
443
+ return "❌ Access denied: Invalid admin password", gr.update()
444
+
445
+ try:
446
+ # Load pending submissions from file (not from the formatted function)
447
+ try:
448
+ with open('pending_submissions.json', 'r') as f:
449
+ pending_data = json.load(f)
450
+ except (FileNotFoundError, json.JSONDecodeError):
451
+ pending_data = []
452
+
453
+ # Find and remove the submission
454
+ rejected_entry = None
455
+ for i, entry in enumerate(pending_data):
456
+ if entry.get('submission_id') == submission_id:
457
+ rejected_entry = pending_data.pop(i)
458
+ break
459
+
460
+ if not rejected_entry:
461
+ return "❌ Submission not found", gr.update()
462
+
463
+ # Save updated pending data
464
+ with open('pending_submissions.json', 'w') as f:
465
+ json.dump(pending_data, f, indent=2)
466
+
467
+ return f"❌ Rejected submission: {rejected_entry.get('name', 'Unknown')}", load_pending_submissions()
468
+
469
+ except Exception as e:
470
+ return f"❌ Error rejecting submission: {str(e)}", gr.update()
471
+
472
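The approval/rejection functions above move entries between `pending_submissions.json` and `leaderboard.json`. A minimal standalone sketch of that moderation flow (no Gradio, throwaway file paths; the real app adds duplicate checks and password gating):

```python
import json
import os
import tempfile

def approve(pending_path, approved_path, submission_id):
    """Move one entry from the pending queue to the approved leaderboard.

    Mirrors the flow above: the matching entry is removed from the pending
    file, its moderation metadata is stripped, and it is appended to the
    approved file. Returns the entry, or None if it was not found.
    """
    with open(pending_path) as f:
        pending = json.load(f)
    entry = next((e for e in pending if e.get("submission_id") == submission_id), None)
    if entry is None:
        return None
    pending.remove(entry)
    for key in ("submitted_at", "status", "submission_id"):
        entry.pop(key, None)
    try:
        with open(approved_path) as f:
            approved = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        approved = []
    approved.append(entry)
    with open(approved_path, "w") as f:
        json.dump(approved, f, indent=2)
    with open(pending_path, "w") as f:
        json.dump(pending, f, indent=2)
    return entry

# Round-trip demo on throwaway files
workdir = tempfile.mkdtemp()
pending_file = os.path.join(workdir, "pending_submissions.json")
approved_file = os.path.join(workdir, "leaderboard.json")
with open(pending_file, "w") as f:
    json.dump([{"name": "DemoWM", "model": "LLaMA3", "normalizedUtility": 0.8,
                "detectionRate": 99.0, "submission_id": "DemoWM_LLaMA3_1",
                "status": "pending", "submitted_at": "2024-12-01T00:00:00"}], f)
approved_entry = approve(pending_file, approved_file, "DemoWM_LLaMA3_1")
```

Note the metadata keys are popped before the entry reaches the leaderboard, so approved entries carry only the submitted metrics.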
+ # Toggle add data section visibility (unused: the add-data section is now always visible)
473
+ def toggle_add_data_section(section):
474
+ return gr.update(visible=not section.visible)
475
+
476
+ # Create the main interface
477
+ def create_interface():
478
+ # Custom CSS for better styling
479
+ css = """
480
+ .gradio-container {
481
+ max-width: 1200px !important;
482
+ margin: 0 auto !important;
483
+ background: linear-gradient(135deg, #f5f7fa 0%, #c3cfe2 100%);
484
+ min-height: 100vh;
485
+ }
486
+ .title {
487
+ text-align: center;
488
+ margin: 20px 0;
489
+ font-size: 3rem;
490
+ font-weight: bold;
491
+ background: linear-gradient(45deg, #667eea 0%, #764ba2 100%);
492
+ -webkit-background-clip: text;
493
+ -webkit-text-fill-color: transparent;
494
+ background-clip: text;
495
+ text-shadow: 2px 2px 4px rgba(0,0,0,0.1);
496
+ }
497
+ .subtitle {
498
+ text-align: center;
499
+ margin-bottom: 30px;
500
+ font-size: 1.3rem;
501
+ color: #4a5568;
502
+ font-weight: 500;
503
+ }
504
+ .controls {
505
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
506
+ padding: 30px;
507
+ border-radius: 15px;
508
+ margin-bottom: 25px;
509
+ box-shadow: 0 8px 32px rgba(0,0,0,0.1);
510
+ border: 1px solid rgba(255,255,255,0.2);
511
+ }
512
+ .controls label {
513
+ color: white !important;
514
+ font-weight: bold !important;
515
+ font-size: 1.2rem !important;
516
+ }
517
+ .controls .gr-radio {
518
+ background: rgba(255,255,255,0.1) !important;
519
+ border-radius: 10px !important;
520
+ padding: 12px !important;
521
+ }
522
+ .controls .gr-radio label {
523
+ color: white !important;
524
+ font-size: 1.1rem !important;
525
+ }
526
+ .controls h3 {
527
+ font-size: 1.4rem !important;
528
+ margin-bottom: 15px !important;
529
+ }
530
+ #highlighted-add-data {
531
+ background: linear-gradient(135deg, #E0F2FE 0%, #B3E5FC 100%) !important;
532
+ border: 2px solid #81D4FA !important;
533
+ border-radius: 15px !important;
534
+ box-shadow: 0 10px 40px rgba(129, 212, 250, 0.3) !important;
535
+ margin: 20px 0 !important;
536
+ }
537
+ #highlighted-add-data .gr-accordion-header {
538
+ background: linear-gradient(135deg, #81D4FA 0%, #4FC3F7 100%) !important;
539
+ color: white !important;
540
+ font-weight: bold !important;
541
+ font-size: 1.2rem !important;
542
+ padding: 15px 20px !important;
543
+ border-radius: 15px 15px 0 0 !important;
544
+ }
545
+ #highlighted-add-data .gr-accordion-content {
546
+ background: rgba(255,255,255,0.95) !important;
547
+ border-radius: 0 0 15px 15px !important;
548
+ padding: 25px !important;
549
+ }
550
+ .gr-button {
551
+ border-radius: 10px !important;
552
+ font-weight: bold !important;
553
+ transition: all 0.3s ease !important;
554
+ }
555
+ .gr-button:hover {
556
+ transform: translateY(-2px) !important;
557
+ box-shadow: 0 5px 15px rgba(0,0,0,0.2) !important;
558
+ }
559
+ .gr-plot {
560
+ border-radius: 15px !important;
561
+ box-shadow: 0 8px 32px rgba(0,0,0,0.1) !important;
562
+ background: white !important;
563
+ padding: 20px !important;
564
+ }
565
+ .gr-dataframe {
566
+ border-radius: 15px !important;
567
+ box-shadow: 0 8px 32px rgba(0,0,0,0.1) !important;
568
+ background: white !important;
569
+ overflow: hidden !important;
570
+ }
571
+ .gr-accordion {
572
+ border-radius: 15px !important;
573
+ box-shadow: 0 8px 32px rgba(0,0,0,0.1) !important;
574
+ background: white !important;
575
+ margin: 15px 0 !important;
576
+ }
577
+ .gr-accordion-header {
578
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%) !important;
579
+ color: white !important;
580
+ font-weight: bold !important;
581
+ padding: 15px 20px !important;
582
+ border-radius: 15px 15px 0 0 !important;
583
+ }
584
+ .gr-accordion-content {
585
+ background: rgba(255,255,255,0.95) !important;
586
+ border-radius: 0 0 15px 15px !important;
587
+ padding: 20px !important;
588
+ }
589
+ #submit-btn {
590
+ background: linear-gradient(135deg, #29B6F6 0%, #0288D1 100%) !important;
591
+ border: 2px solid #0277BD !important;
592
+ color: white !important;
593
+ font-weight: bold !important;
594
+ font-size: 1.1rem !important;
595
+ padding: 15px 30px !important;
596
+ border-radius: 12px !important;
597
+ box-shadow: 0 8px 25px rgba(41, 182, 246, 0.4) !important;
598
+ transition: all 0.3s ease !important;
599
+ }
600
+ #submit-btn:hover {
601
+ background: linear-gradient(135deg, #0288D1 0%, #0277BD 100%) !important;
602
+ transform: translateY(-3px) !important;
603
+ box-shadow: 0 12px 35px rgba(41, 182, 246, 0.6) !important;
604
+ }
605
+ #owner-controls {
606
+ background: linear-gradient(135deg, #FFE0E0 0%, #FFCDD2 100%) !important;
607
+ border: 2px solid #FF5722 !important;
608
+ border-radius: 15px !important;
609
+ box-shadow: 0 10px 40px rgba(255, 87, 34, 0.3) !important;
610
+ margin: 20px 0 !important;
611
+ }
612
+ #owner-controls .gr-accordion-header {
613
+ background: linear-gradient(135deg, #FF5722 0%, #D32F2F 100%) !important;
614
+ color: white !important;
615
+ font-weight: bold !important;
616
+ font-size: 1.2rem !important;
617
+ padding: 15px 20px !important;
618
+ border-radius: 15px 15px 0 0 !important;
619
+ }
620
+ #owner-controls .gr-accordion-content {
621
+ background: rgba(255,255,255,0.95) !important;
622
+ border-radius: 0 0 15px 15px !important;
623
+ padding: 25px !important;
624
+ }
625
+ #approve-btn {
626
+ background: linear-gradient(135deg, #4CAF50 0%, #2E7D32 100%) !important;
627
+ border: 2px solid #388E3C !important;
628
+ color: white !important;
629
+ font-weight: bold !important;
630
+ font-size: 1.1rem !important;
631
+ padding: 15px 30px !important;
632
+ border-radius: 12px !important;
633
+ box-shadow: 0 8px 25px rgba(76, 175, 80, 0.4) !important;
634
+ transition: all 0.3s ease !important;
635
+ }
636
+ #approve-btn:hover {
637
+ background: linear-gradient(135deg, #2E7D32 0%, #1B5E20 100%) !important;
638
+ transform: translateY(-3px) !important;
639
+ box-shadow: 0 12px 35px rgba(76, 175, 80, 0.6) !important;
640
+ }
641
+ #reject-btn {
642
+ background: linear-gradient(135deg, #F44336 0%, #C62828 100%) !important;
643
+ border: 2px solid #D32F2F !important;
644
+ color: white !important;
645
+ font-weight: bold !important;
646
+ font-size: 1.1rem !important;
647
+ padding: 15px 30px !important;
648
+ border-radius: 12px !important;
649
+ box-shadow: 0 8px 25px rgba(244, 67, 54, 0.4) !important;
650
+ transition: all 0.3s ease !important;
651
+ }
652
+ #reject-btn:hover {
653
+ background: linear-gradient(135deg, #C62828 0%, #B71C1C 100%) !important;
654
+ transform: translateY(-3px) !important;
655
+ box-shadow: 0 12px 35px rgba(244, 67, 54, 0.6) !important;
656
+ }
657
+ #guideline-section {
658
+ background: linear-gradient(135deg, #E8F5E8 0%, #C8E6C9 100%) !important;
659
+ border: 2px solid #4CAF50 !important;
660
+ border-radius: 15px !important;
661
+ box-shadow: 0 10px 40px rgba(76, 175, 80, 0.3) !important;
662
+ margin: 20px 0 !important;
663
+ }
664
+ #guideline-section .gr-accordion-header {
665
+ background: linear-gradient(135deg, #4CAF50 0%, #2E7D32 100%) !important;
666
+ color: white !important;
667
+ font-weight: bold !important;
668
+ font-size: 1.2rem !important;
669
+ padding: 15px 20px !important;
670
+ border-radius: 15px 15px 0 0 !important;
671
+ }
672
+ #guideline-section .gr-accordion-content {
673
+ background: rgba(255,255,255,0.95) !important;
674
+ border-radius: 0 0 15px 15px !important;
675
+ padding: 25px !important;
676
+ }
677
+ """
678
+
679
+ with gr.Blocks(css=css, title="Watermark Leaderboard for LLMs") as demo:
680
+ # Header
681
+ gr.HTML("""
682
+ <div class="title">
683
+ 🏆 Watermark Leaderboard for LLMs 🏆
684
+ </div>
685
+ <div class="subtitle">
686
+ 📊 Interactive leaderboard for comparing watermark performance across different models and evaluation settings
687
+ </div>
688
+ """)
689
+
690
+ # Controls
691
+ with gr.Row():
692
+ with gr.Column(scale=1):
693
+ gr.HTML("<div style='text-align: center; margin-bottom: 15px;'><h3 style='color: #667eea; margin: 0; font-weight: bold;'>🤖 Model Selection</h3></div>")
694
+ model_selector = gr.Radio(
695
+ choices=["LLaMA3", "DeepSeek"],
696
+ value="LLaMA3",
697
+ label="Model",
698
+ info="Select the model to display"
699
+ )
700
+ with gr.Column(scale=1):
701
+ gr.HTML("<div style='text-align: center; margin-bottom: 15px;'><h3 style='color: #667eea; margin: 0; font-weight: bold;'>⚙️ Evaluation Setting</h3></div>")
702
+ metric_selector = gr.Radio(
703
+ choices=["Attack-free", "Watermark Removal", "Stealing Attack"],
704
+ value="Attack-free",
705
+ label="Setting",
706
+ info="Select the evaluation setting"
707
+ )
708
+
709
+
710
+ # Add Your Data Section (Highlighted)
711
+ with gr.Accordion("🚀 Add Your Data to the Leaderboard", open=False, elem_id="highlighted-add-data"):
712
+ gr.HTML("""
713
+ <div style='text-align: center; margin-bottom: 20px;'>
714
+ <h2 style='color: #0277BD; margin: 0; font-size: 1.5rem;'>📝 Submit Your Watermark Performance Results</h2>
715
+ <p style='color: #374151; margin: 10px 0 0 0;'>Contribute to the community by sharing your watermark evaluation results</p>
716
+ </div>
717
+ <div style='background: #E3F2FD; border: 1px solid #2196F3; border-radius: 8px; padding: 15px; margin-bottom: 20px;'>
718
+ <h4 style='color: #1976D2; margin: 0 0 10px 0;'>📋 Submission Requirements</h4>
719
+ <p style='color: #374151; margin: 0 0 8px 0;'>Provide at least one complete set of metrics:</p>
720
+ <ul style='color: #374151; margin: 0; padding-left: 20px;'>
721
+ <li><strong>Attack-free:</strong> Normalized Utility + Detection Rate</li>
722
+ <li><strong>Watermark Removal:</strong> Absolute Utility Degradation + Removal Detection Rate</li>
723
+ <li><strong>Stealing Attack:</strong> Adversary BERT Score + Adversary Detection Rate</li>
724
+ </ul>
725
+ </div>
726
+ """)
727
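The "at least one complete set of metrics" rule listed above can be expressed as a small helper. This is only a sketch of the rule; the app's actual check lives in `submit_watermark_data`:

```python
def has_complete_pair(first, second):
    # A metric pair counts only when both values are provided.
    return first is not None and second is not None

def submission_is_valid(normalized_utility=None, detection_rate=None,
                        absolute_utility_degradation=None, removal_detection_rate=None,
                        adversary_bert_score=None, adversary_detection_rate=None):
    # At least one of the three metric pairs must be fully provided.
    return (has_complete_pair(normalized_utility, detection_rate)
            or has_complete_pair(absolute_utility_degradation, removal_detection_rate)
            or has_complete_pair(adversary_bert_score, adversary_detection_rate))

ok = submission_is_valid(normalized_utility=0.8, detection_rate=99.0)
incomplete = submission_is_valid(normalized_utility=0.8)  # missing its detection rate
empty = submission_is_valid()
```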
+ with gr.Row():
728
+ with gr.Column(scale=1):
729
+ # Basic Information
730
+ gr.HTML("<div style='text-align: center; margin-bottom: 15px;'><h3 style='color: #0277BD; margin: 0;'>📋 Basic Information</h3></div>")
731
+ watermark_name = gr.Textbox(
732
+ label="Watermark Name",
733
+ placeholder="e.g., MyWatermark, Watermark-X",
734
+ info="Unique identifier for your watermark"
735
+ )
736
+ paper_link = gr.Textbox(
737
+ label="Paper Link (Optional)",
738
+ placeholder="https://arxiv.org/abs/xxxx.xxxxx or https://...",
739
+ info="Link to the paper describing this watermark method"
740
+ )
741
+ submission_model = gr.Radio(
742
+ choices=["LLaMA3", "DeepSeek"],
743
+ label="Model",
744
+ value="LLaMA3",
745
+ info="Select the model used"
746
+ )
747
+
748
+ with gr.Column(scale=1):
749
+ # Attack-free Metrics (Optional)
750
+ gr.HTML("<div style='text-align: center; margin-bottom: 15px;'><h3 style='color: #0277BD; margin: 0;'>⚡ Attack-free Metrics (Optional - Both Required if One is Provided)</h3></div>")
751
+ normalized_utility = gr.Number(
752
+ label="Normalized Utility",
753
+ value=None,
754
+ minimum=0.0,
755
+ maximum=1.0,
756
+ step=0.001,
757
+ info="Text quality metric (0.000 - 1.000)"
758
+ )
759
+ detection_rate = gr.Number(
760
+ label="Detection Rate (%)",
761
+ value=None,
762
+ minimum=0.0,
763
+ maximum=100.0,
764
+ step=0.001,
765
+ info="Watermark detection accuracy (0.000 - 100.000%)"
766
+ )
767
+
768
+ with gr.Row():
769
+ with gr.Column(scale=1):
770
+ # Watermark Removal Metrics (Optional)
771
+ gr.HTML("<div style='text-align: center; margin-bottom: 15px;'><h3 style='color: #0277BD; margin: 0;'>🛡️ Watermark Removal (Optional)</h3></div>")
772
+ absolute_utility_degradation = gr.Number(
773
+ label="Absolute Utility Degradation",
774
+ value=None,
775
+ minimum=0.0,
776
+ maximum=1.0,
777
+ step=0.001,
778
+ info="Resistance to removal attacks (0.000 - 1.000)"
779
+ )
780
+ removal_detection_rate = gr.Number(
781
+ label="Removal Detection Rate (%)",
782
+ value=None,
783
+ minimum=0.0,
784
+ maximum=100.0,
785
+ step=0.001,
786
+ info="Detection rate under removal attacks (0.000 - 100.000%)"
787
+ )
788
+
789
+ with gr.Column(scale=1):
790
+ # Stealing Attack Metrics (Optional)
791
+ gr.HTML("<div style='text-align: center; margin-bottom: 15px;'><h3 style='color: #0277BD; margin: 0;'>🎯 Stealing Attack (Optional)</h3></div>")
792
+ adversary_bert_score = gr.Number(
793
+ label="Adversary BERT Score",
794
+ value=None,
795
+ minimum=0.0,
796
+ maximum=1.0,
797
+ step=0.001,
798
+ info="Performance under adversarial conditions (0.000 - 1.000)"
799
+ )
800
+ adversary_detection_rate = gr.Number(
801
+ label="Adversary Detection Rate (%)",
802
+ value=None,
803
+ minimum=0.0,
804
+ maximum=100.0,
805
+ step=0.001,
806
+ info="Detection rate under adversarial attacks (0.000 - 100.000%)"
807
+ )
808
+
809
+ # Submit and Clear buttons
810
+ with gr.Row():
811
+ with gr.Column(scale=1):
812
+ submit_btn = gr.Button(
813
+ "🚀 Submit Data to Leaderboard",
814
+ variant="primary",
815
+ size="lg",
816
+ elem_id="submit-btn"
817
+ )
818
+ with gr.Column(scale=1):
819
+ clear_btn = gr.Button(
820
+ "🗑️ Clear Form",
821
+ variant="secondary",
822
+ size="lg"
823
+ )
824
+
825
+ # Status message
826
+ status_message = gr.Markdown("", visible=True)
827
+
828
+
829
+ # Scatter Plot
830
+ scatter_plot = gr.Plot(
831
+ label="Performance Scatter Plot",
832
+ show_label=True
833
+ )
834
+
835
+ # Table
836
+ table = gr.DataFrame(
837
+ label="Performance Table",
838
+ show_label=True,
839
+ interactive=False,
840
+ wrap=True
841
+ )
842
+
843
+ # Guideline and Metrics Explained Section (At bottom with light green background)
844
+ with gr.Accordion("📋 Guideline for Submitting Watermark Performance Results", open=False, elem_id="guideline-section"):
845
+ gr.HTML("""
846
+ <div style="padding: 20px;">
847
+ <h3>Guideline for Submitting Watermark Performance Results</h3>
848
+ <h4>1. Datasets</h4>
849
+ <ul>
850
+ <li><strong>Text Generation (C4 dataset)</strong>
851
+ <ul>
852
+ <li>Training: first 20,000 samples</li>
853
+ <li>Testing: 13,860 samples</li>
854
+ <li>Reference script: <code>Files/Reproducibility/C4_dataset_download.py</code></li>
855
+ </ul>
856
+ </li>
857
+ <li><strong>Text Summarization (CNN/Daily Mail dataset)</strong>
858
+ <ul>
859
+ <li>Training: first 10,000–20,000 samples</li>
860
+ <li>Testing: 1,000 samples</li>
861
+ <li>Reference script: <code>Files/Reproducibility/CNN_dataset_download.py</code></li>
862
+ </ul>
863
+ </li>
864
+ </ul>
865
+ <h4>2. Models</h4>
866
+ <ul>
867
+ <li>Use open-source models available on Hugging Face:
868
+ <ul>
869
+ <li>DeepSeek: "deepseek-ai/deepseek-llm-7b-base"</li>
870
+ <li>LLaMA-3: "meta-llama/Meta-Llama-3-8B"</li>
871
+ </ul>
872
+ </li>
873
+ </ul>
874
+ <h4>3. Evaluation Settings</h4>
875
+ <ul>
876
+ <li><strong>(a) Attack-Free Setting</strong>
877
+ <ul>
878
+ <li>Generate 13,860 watermarked outputs on the C4 test set.</li>
879
+ <li>Report: Detection Rate and Normalized Utility (see Metrics).</li>
880
+ </ul>
881
+ </li>
882
+ <li><strong>(b) Watermark Removal Setting</strong>
883
+ <ul>
884
+ <li>Apply Dipper to paraphrase watermarked outputs.</li>
885
+ <li>Report:
886
+ <ul>
887
+ <li>Detection Rate after attack</li>
888
+ <li>Normalized Utility after attack</li>
889
+ <li>Absolute Utility Degradation (difference before vs. after attack)</li>
890
+ </ul>
891
+ </li>
892
+ <li>Reference scripts: <code>Files/Reproducibility/Attack_dipper.py</code></li>
893
+ </ul>
894
+ </li>
895
+ <li><strong>(c) Stealing Attack Setting</strong>
896
+ <ul>
897
+ <li>Generate 20,000 watermarked samples for training a surrogate model using LoRA.</li>
898
+ <li>Use the surrogate model for summarization on 1,000 test samples.</li>
899
+ <li>Report: Detection Rate and Normalized Utility on the surrogate's outputs.</li>
900
+ <li>Reference scripts: <code>Files/Reproducibility/Finetune_sum.py</code>, <code>Files/Reproducibility/Inference_sum.py</code></li>
901
+ </ul>
902
+ </li>
903
+ </ul>
904
+ <h4>4. Metrics</h4>
905
+ <ul>
906
+ <li><strong>Detection Rate</strong>
907
+ <ul>
908
+ <li>Average accuracy across the test set (e.g., 13,860 examples for text generation).</li>
909
+ <li>Use your own detector implementation.</li>
910
+ </ul>
911
+ </li>
912
+ <li><strong>Normalized Utility</strong>
913
+ <ul>
914
+ <li>Defined as the mean of:</li>
915
+ <li>BERTScore (<code>Files/Reproducibility/BERT_score.py</code>)</li>
916
+ <li>Entity Similarity Score (<code>Files/Reproducibility/Entity_similarity_score.py</code>)</li>
917
+ </ul>
918
+ </li>
919
+ <li><strong>Absolute Utility Degradation</strong>
920
+ <ul>
921
+ <li>The absolute change in Normalized Utility between attack-free and attacked outputs.</li>
922
+ </ul>
923
+ </li>
924
+ </ul>
925
+ <h4>5. Submission</h4>
926
+ <ul>
927
+ <li>You may submit results for one or more evaluation settings (Attack-Free, Removal, Stealing).</li>
928
+ <li>Please include:
929
+ <ul>
930
+ <li>Model(s) evaluated</li>
931
+ <li>Dataset(s) used</li>
932
+ <li>Scripts/configuration details if modified</li>
933
+ <li>Reported metrics in the required format</li>
934
+ </ul>
935
+ </li>
936
+ </ul>
937
+ <p><strong>Reproducibility codes are available in the Files tab of this Space.</strong></p>
938
+ </div>
939
+ """)
940
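The metric definitions in the guideline above reduce to simple arithmetic: Normalized Utility is the mean of BERTScore and Entity Similarity Score, and Absolute Utility Degradation is the absolute change in Normalized Utility before vs. after an attack. A sketch (the example scores are illustrative, not from the leaderboard):

```python
def normalized_utility(bert_score: float, entity_similarity: float) -> float:
    # Per the guideline: the mean of BERTScore and Entity Similarity Score.
    return (bert_score + entity_similarity) / 2

def absolute_utility_degradation(before: float, after: float) -> float:
    # Absolute change in Normalized Utility, attack-free vs. attacked outputs.
    return abs(before - after)

attack_free = normalized_utility(0.90, 0.70)   # ~0.80
attacked = normalized_utility(0.80, 0.50)      # ~0.65
degradation = absolute_utility_degradation(attack_free, attacked)  # ~0.15
```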
+
941
+ # Owner Approval Section (At the very bottom)
942
+ with gr.Accordion("🔒 Owner Controls - Pending Submissions", open=False, elem_id="owner-controls"):
943
+ gr.HTML("""
944
+ <div style='text-align: center; margin-bottom: 20px;'>
945
+ <h2 style='color: #D32F2F; margin: 0; font-size: 1.5rem;'>🛡️ Administrator Approval Panel</h2>
946
+ <p style='color: #374151; margin: 10px 0 0 0;'>Review and approve pending submissions before they appear on the leaderboard</p>
947
+ </div>
948
+ """)
949
+
950
+ # Pending submissions table
951
+ pending_table = gr.DataFrame(
952
+ label="📋 Pending Submissions",
953
+ show_label=True,
954
+ interactive=False,
955
+ wrap=True,
956
+ headers=["ID", "Name", "Model", "Paper Link", "Attack-free Utility", "Attack-free Detection",
957
+ "Removal Degradation", "Removal Detection", "Adversary BERT", "Adversary Detection", "Submitted At"]
958
+ )
959
+
960
+ # Admin authentication
961
+ admin_password_input = gr.Textbox(
962
+ label="🔐 Admin Password",
963
+ placeholder="Enter admin password to access controls",
964
+ type="password",
965
+ info="Required for approval/rejection actions"
966
+ )
967
+
968
+ # Approval controls
969
+ with gr.Row():
970
+ with gr.Column(scale=1):
971
+ submission_id_input = gr.Textbox(
972
+ label="Submission ID",
973
+ placeholder="Enter submission ID to approve/reject",
974
+ info="Copy from the pending submissions table"
975
+ )
976
+ approve_btn = gr.Button(
977
+ "✅ Approve Submission",
978
+ variant="primary",
979
+ size="lg",
980
+ elem_id="approve-btn"
981
+ )
982
+ with gr.Column(scale=1):
983
+ reject_btn = gr.Button(
984
+ "❌ Reject Submission",
985
+ variant="stop",
986
+ size="lg",
987
+ elem_id="reject-btn"
988
+ )
989
+ refresh_pending_btn = gr.Button(
990
+ "🔄 Refresh Pending",
991
+ variant="secondary",
992
+ size="lg"
993
+ )
994
+
995
+ approval_status = gr.Markdown("", visible=True)
996
+
997
+ # Event handlers
998
+ model_selector.change(
999
+ fn=update_interface,
1000
+ inputs=[model_selector, metric_selector],
1001
+ outputs=[scatter_plot, table]
1002
+ )
1003
+
1004
+ metric_selector.change(
1005
+ fn=update_interface,
1006
+ inputs=[model_selector, metric_selector],
1007
+ outputs=[scatter_plot, table]
1008
+ )
1009
+
1010
+ # Form submission handler
1011
+ submit_btn.click(
1012
+ fn=submit_watermark_data,
1013
+ inputs=[
1014
+ watermark_name,
1015
+ submission_model,
1016
+ paper_link,
1017
+ normalized_utility,
1018
+ detection_rate,
1019
+ absolute_utility_degradation,
1020
+ removal_detection_rate,
1021
+ adversary_bert_score,
1022
+ adversary_detection_rate
1023
+ ],
1024
+ outputs=[status_message, scatter_plot, table]
1025
+ )
1026
+
1027
+ # Clear form handler
1028
+ clear_btn.click(
1029
+ fn=clear_form,
1030
+ outputs=[
1031
+ watermark_name,
1032
+ paper_link,
1033
+ submission_model,
1034
+ normalized_utility,
1035
+ detection_rate,
1036
+ absolute_utility_degradation,
1037
+ removal_detection_rate,
1038
+ adversary_bert_score,
1039
+ adversary_detection_rate
1040
+ ]
1041
+ )
1042
+
1043
+ # Add data button handler
1044
+ # The add_data_button is removed, so this handler is no longer needed.
1045
+ # The highlighted section is now always visible.
1046
+
1047
+ # Owner approval event handlers
1048
+ approve_btn.click(
1049
+ fn=approve_submission,
1050
+ inputs=[submission_id_input, admin_password_input],
1051
+ outputs=[approval_status, pending_table]
1052
+ )
1053
+
1054
+ reject_btn.click(
1055
+ fn=reject_submission,
1056
+ inputs=[submission_id_input, admin_password_input],
1057
+ outputs=[approval_status, pending_table]
1058
+ )
1059
+
1060
+ refresh_pending_btn.click(
1061
+ fn=load_pending_submissions,
1062
+ outputs=[pending_table]
1063
+ )
1064
+
1065
+ # Initial load
1066
+ demo.load(
1067
+ fn=lambda: update_interface("LLaMA3", "Attack-free"),
1068
+ outputs=[scatter_plot, table]
1069
+ )
1070
+
1071
+ # Load pending submissions on startup
1072
+ demo.load(
1073
+ fn=load_pending_submissions,
1074
+ outputs=[pending_table]
1075
+ )
1076
+
1077
+ # Clear admin password after actions for security
1078
+ def clear_admin_password():
1079
+ return gr.update(value="")
1080
+
1081
+ # Clear password after approve/reject actions
1082
+ approve_btn.click(
1083
+ fn=clear_admin_password,
1084
+ outputs=[admin_password_input]
1085
+ )
1086
+
1087
+ reject_btn.click(
1088
+ fn=clear_admin_password,
1089
+ outputs=[admin_password_input]
1090
+ )
1091
+
1092
+ return demo
1093
+
1094
+ # Create and launch the interface
1095
+ if __name__ == "__main__":
1096
+ demo = create_interface()
1097
+ demo.launch()
assets/index-Cd6CRo7g.js ADDED
The diff for this file is too large to render. See raw diff
 
assets/index-tTSI8ghR.css ADDED
var(--tw-space-y-reverse)));margin-bottom:calc(.5rem * var(--tw-space-y-reverse))}.space-y-3>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(.75rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(.75rem * var(--tw-space-y-reverse))}.space-y-4>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(1rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(1rem * var(--tw-space-y-reverse))}.space-y-6>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(1.5rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(1.5rem * var(--tw-space-y-reverse))}.overflow-hidden{overflow:hidden}.overflow-clip{overflow:clip}.overflow-y-auto{overflow-y:auto}.whitespace-nowrap{white-space:nowrap}.rounded{border-radius:.25rem}.rounded-2xl{border-radius:1rem}.rounded-3xl{border-radius:1.5rem}.rounded-lg{border-radius:.5rem}.border{border-width:1px}.border-b{border-bottom-width:1px}.border-b-2{border-bottom-width:2px}.border-t{border-top-width:1px}.border-border{border-color:var(--border)}.border-green-400{--tw-border-opacity: 1;border-color:rgb(74 222 128 / var(--tw-border-opacity, 1))}.border-transparent{border-color:var(--transparent)}.bg-bg{background-color:var(--bg)}.bg-bg-dark{background-color:var(--bg-dark)}.bg-bg-light{background-color:var(--bg-light)}.bg-black\/20{background-color:#0003}.bg-primary{background-color:var(--primary)}.bg-gradient-to-r{background-image:linear-gradient(to right,var(--tw-gradient-stops))}.from-blue-500{--tw-gradient-from: #3b82f6 var(--tw-gradient-from-position);--tw-gradient-to: rgb(59 130 246 / 0) var(--tw-gradient-to-position);--tw-gradient-stops: var(--tw-gradient-from), var(--tw-gradient-to)}.from-green-500{--tw-gradient-from: #22c55e var(--tw-gradient-from-position);--tw-gradient-to: rgb(34 197 94 / 0) var(--tw-gradient-to-position);--tw-gradient-stops: var(--tw-gradient-from), var(--tw-gradient-to)}.to-green-600{--tw-gradient-to: #16a34a 
var(--tw-gradient-to-position)}.to-purple-600{--tw-gradient-to: #9333ea var(--tw-gradient-to-position)}.p-2{padding:.5rem}.p-3{padding:.75rem}.p-4{padding:1rem}.p-6{padding:1.5rem}.px-10{padding-left:2.5rem;padding-right:2.5rem}.px-2{padding-left:.5rem;padding-right:.5rem}.px-3{padding-left:.75rem;padding-right:.75rem}.px-4{padding-left:1rem;padding-right:1rem}.px-6{padding-left:1.5rem;padding-right:1.5rem}.py-1{padding-top:.25rem;padding-bottom:.25rem}.py-2{padding-top:.5rem;padding-bottom:.5rem}.py-3{padding-top:.75rem;padding-bottom:.75rem}.pb-2{padding-bottom:.5rem}.pb-4{padding-bottom:1rem}.pl-6{padding-left:1.5rem}.pr-12{padding-right:3rem}.pt-2{padding-top:.5rem}.pt-4{padding-top:1rem}.pt-8{padding-top:2rem}.text-center{text-align:center}.text-2xl{font-size:1.5rem;line-height:2rem}.text-4xl{font-size:2.25rem;line-height:2.5rem}.text-6xl{font-size:3.75rem;line-height:1}.text-lg{font-size:1.125rem;line-height:1.75rem}.text-sm{font-size:.875rem;line-height:1.25rem}.text-xl{font-size:1.25rem;line-height:1.75rem}.font-bold{font-weight:700}.font-medium{font-weight:500}.font-semibold{font-weight:600}.text-black{--tw-text-opacity: 1;color:rgb(0 0 0 / var(--tw-text-opacity, 1))}.text-primary{color:var(--primary)}.text-text{color:var(--text)}.text-text-muted{color:var(--text-muted)}.text-white{--tw-text-opacity: 1;color:rgb(255 255 255 / var(--tw-text-opacity, 1))}.placeholder-gray-400::-moz-placeholder{--tw-placeholder-opacity: 1;color:rgb(156 163 175 / var(--tw-placeholder-opacity, 1))}.placeholder-gray-400::placeholder{--tw-placeholder-opacity: 1;color:rgb(156 163 175 / var(--tw-placeholder-opacity, 1))}.opacity-0{opacity:0}.shadow-lg{--tw-shadow: 0 10px 15px -3px rgb(0 0 0 / .1), 0 4px 6px -4px rgb(0 0 0 / .1);--tw-shadow-colored: 0 10px 15px -3px var(--tw-shadow-color), 0 4px 6px -4px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.shadow-md{--tw-shadow: 0 4px 6px -1px rgb(0 0 0 / .1), 0 
2px 4px -2px rgb(0 0 0 / .1);--tw-shadow-colored: 0 4px 6px -1px var(--tw-shadow-color), 0 2px 4px -2px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.filter{filter:var(--tw-blur) var(--tw-brightness) var(--tw-contrast) var(--tw-grayscale) var(--tw-hue-rotate) var(--tw-invert) var(--tw-saturate) var(--tw-sepia) var(--tw-drop-shadow)}.backdrop-blur-sm{--tw-backdrop-blur: blur(4px);-webkit-backdrop-filter:var(--tw-backdrop-blur) var(--tw-backdrop-brightness) var(--tw-backdrop-contrast) var(--tw-backdrop-grayscale) var(--tw-backdrop-hue-rotate) var(--tw-backdrop-invert) var(--tw-backdrop-opacity) var(--tw-backdrop-saturate) var(--tw-backdrop-sepia);backdrop-filter:var(--tw-backdrop-blur) var(--tw-backdrop-brightness) var(--tw-backdrop-contrast) var(--tw-backdrop-grayscale) var(--tw-backdrop-hue-rotate) var(--tw-backdrop-invert) var(--tw-backdrop-opacity) var(--tw-backdrop-saturate) var(--tw-backdrop-sepia)}.transition{transition-property:color,background-color,border-color,text-decoration-color,fill,stroke,opacity,box-shadow,transform,filter,-webkit-backdrop-filter;transition-property:color,background-color,border-color,text-decoration-color,fill,stroke,opacity,box-shadow,transform,filter,backdrop-filter;transition-property:color,background-color,border-color,text-decoration-color,fill,stroke,opacity,box-shadow,transform,filter,backdrop-filter,-webkit-backdrop-filter;transition-timing-function:cubic-bezier(.4,0,.2,1);transition-duration:.15s}.transition-all{transition-property:all;transition-timing-function:cubic-bezier(.4,0,.2,1);transition-duration:.15s}.transition-colors{transition-property:color,background-color,border-color,text-decoration-color,fill,stroke;transition-timing-function:cubic-bezier(.4,0,.2,1);transition-duration:.15s}.duration-300{transition-duration:.3s}.duration-500,.duration-long{transition-duration:.5s}.duration-short{transition-duration:.3s}.ease-in-out{transition-tim
ing-function:cubic-bezier(.4,0,.2,1)}:root{font-family:system-ui,Avenir,Helvetica,Arial,sans-serif;line-height:1.5;font-weight:400;color-scheme:light dark;color:#ffffffde;background-color:#242424;font-synthesis:none;text-rendering:optimizeLegibility;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale}a{font-weight:500;color:#646cff;text-decoration:inherit}a:hover{color:#535bf2}body{margin:0;min-width:320px;min-height:100vh}h1{font-size:3.2em;line-height:1.1}button{border-radius:8px;border:1px solid transparent;padding:.6em 1.2em;font-size:1em;font-weight:500;font-family:inherit;background-color:#1a1a1a;cursor:pointer;transition:border-color .25s}button:hover{border-color:#646cff}button:focus,button:focus-visible{outline:4px auto -webkit-focus-ring-color}@media (prefers-color-scheme: light){:root{color:#213547;background-color:#fff}a:hover{color:#747bff}button{background-color:#f9f9f9}}.no-scrollbar::-webkit-scrollbar{display:none}.no-scrollbar{-ms-overflow-style:none;scrollbar-width:none}:root{--bg-dark: hsl(220 59% 91%);--bg: hsl(220 100% 97%);--bg-light: hsl(220 100% 100%);--text: hsl(226 85% 7%);--text-muted: hsl(220 26% 31%);--highlight: hsl(220 100% 100%);--border: hsl(220 19% 53%);--border-muted: hsl(220 27% 65%);--primary: hsl(219 78% 50%);--secondary: hsl(39 54% 61%);--danger: hsl(9 26% 64%);--warning: hsl(52 19% 57%);--success: hsl(146 17% 59%);--info: hsl(217 28% 65%);--transparent: rgba(206, 25, 25, 0)}.dark{--bg-dark: hsl(336 0% 1%);--bg: hsl(300 0% 4%);--bg-light: hsl(0 0% 9%);--text: hsl(300 0% 95%);--text-muted: hsl(300 0% 69%);--highlight: hsl(330 0% 39%);--border: hsl(0 0% 28%);--border-muted: hsl(300 0% 18%);--primary: hsl(219 78% 50%);--secondary: hsl(39 54% 61%);--danger: hsl(9 26% 64%);--warning: hsl(52 19% 57%);--success: hsl(146 17% 59%);--info: hsl(217 28% 65%);--transparent: rgba(225, 1, 1, 0)}*{transition:color .3s ease,background-color .3s ease,border-color .3s ease,box-shadow .3s 
ease}.hover\:border-blue-400:hover{--tw-border-opacity: 1;border-color:rgb(96 165 250 / var(--tw-border-opacity, 1))}.hover\:border-green-500:hover{--tw-border-opacity: 1;border-color:rgb(34 197 94 / var(--tw-border-opacity, 1))}.hover\:border-primary:hover{border-color:var(--primary)}.hover\:border-red-900:hover{--tw-border-opacity: 1;border-color:rgb(127 29 29 / var(--tw-border-opacity, 1))}.hover\:border-secondary:hover{border-color:var(--secondary)}.hover\:bg-gray-700\/30:hover{background-color:#3741514d}.hover\:bg-primary:hover{background-color:var(--primary)}.hover\:from-green-600:hover{--tw-gradient-from: #16a34a var(--tw-gradient-from-position);--tw-gradient-to: rgb(22 163 74 / 0) var(--tw-gradient-to-position);--tw-gradient-stops: var(--tw-gradient-from), var(--tw-gradient-to)}.hover\:to-green-700:hover{--tw-gradient-to: #15803d var(--tw-gradient-to-position)}.hover\:text-bg-dark:hover{color:var(--bg-dark)}.hover\:text-text:hover{color:var(--text)}.hover\:opacity-90:hover{opacity:.9}.hover\:shadow-lg:hover{--tw-shadow: 0 10px 15px -3px rgb(0 0 0 / .1), 0 4px 6px -4px rgb(0 0 0 / .1);--tw-shadow-colored: 0 10px 15px -3px var(--tw-shadow-color), 0 4px 6px -4px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.hover\:shadow-xl:hover{--tw-shadow: 0 20px 25px -5px rgb(0 0 0 / .1), 0 8px 10px -6px rgb(0 0 0 / .1);--tw-shadow-colored: 0 20px 25px -5px var(--tw-shadow-color), 0 8px 10px -6px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.focus\:border-primary:focus{border-color:var(--primary)}.focus\:outline-none:focus{outline:2px solid transparent;outline-offset:2px}@media (min-width: 768px){.md\:left-1\/4{left:25%}.md\:w-1\/2{width:50%}.md\:grid-cols-2{grid-template-columns:repeat(2,minmax(0,1fr))}.md\:p-0{padding:0}}
deploy_to_huggingface.py ADDED
@@ -0,0 +1,138 @@
+ #!/usr/bin/env python3
+ """
+ Script to deploy the Watermark Leaderboard to Hugging Face Spaces
+ """
+
+ import os
+ import shutil
+ import json
+ from pathlib import Path
+
+ def copy_files_to_hf_directory():
+     """Copy necessary files to the Hugging Face deployment directory"""
+
+     # Files to copy from the main project
+     source_dir = Path("../")
+     hf_dir = Path(".")
+
+     # Essential files for Hugging Face deployment
+     files_to_copy = [
+         "app.py",
+         "requirements.txt",
+         "README.md",
+         "leaderboard.json"
+     ]
+
+     # Copy Reproducibility folder if it exists
+     reproducibility_source = source_dir / "Reproducibility"
+     if reproducibility_source.exists():
+         reproducibility_dest = hf_dir / "Reproducibility"
+         if reproducibility_dest.exists():
+             shutil.rmtree(reproducibility_dest)
+         shutil.copytree(reproducibility_source, reproducibility_dest)
+         print("✅ Copied Reproducibility folder")
+
+     # Copy individual files
+     for file_name in files_to_copy:
+         source_file = source_dir / file_name
+         dest_file = hf_dir / file_name
+
+         if source_file.exists():
+             shutil.copy2(source_file, dest_file)
+             print(f"✅ Copied {file_name}")
+         else:
+             print(f"⚠️ {file_name} not found in source directory")
+
+     print("\n🎉 Files copied successfully!")
+     print("\nNext steps:")
+     print("1. Create a new Hugging Face Space")
+     print("2. Upload all files in this directory")
+     print("3. Set the Space to use Gradio SDK")
+     print("4. Your leaderboard will be live!")
+
+ def create_hf_readme():
+     """Create a Hugging Face specific README"""
+     readme_content = """---
+ title: Watermark Leaderboard
+ emoji: 🏆
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: Interactive leaderboard for watermark performance evaluation
+ ---
+
+ # Watermark Leaderboard 🏆
+
+ An interactive leaderboard for comparing watermark performance across different models and evaluation settings.
+
+ ## Features
+
+ - **Interactive Scatter Plot**: Visualize watermark performance with Plotly charts
+ - **Performance Table**: Detailed metrics with sorting and filtering
+ - **Multiple Evaluation Settings**: Attack-free, Watermark Removal, and Stealing Attack
+ - **Model Support**: LLaMA3 and DeepSeek models
+ - **Dynamic Filtering**: Real-time updates based on model and metric selection
+ - **Flexible Submissions**: Submit data for any combination of attack types
+ - **Pending Approval System**: All submissions reviewed before appearing on leaderboard
+ - **Complete Field Visibility**: Administrators see all submission details for review
+ - **Professional UI**: Clean, modern interface with accordion sections
+ - **Reproducibility**: Access to all evaluation codes and guidelines
+
+ ## How to Use
+
+ 1. **Select Model**: Choose between LLaMA3 or DeepSeek
+ 2. **Choose Setting**: Pick from Attack-free, Watermark Removal, or Stealing Attack
+ 3. **View Results**: Explore the scatter plot and detailed table
+ 4. **Submit Data**: Click "Add Your Data" to submit new results
+    - Submit any combination of attack types (Attack-free, Watermark Removal, Stealing Attack)
+    - All submissions go through approval process before appearing on leaderboard
+ 5. **Administrator Review**: Administrators can review pending submissions with full field visibility
+
+ ## Metrics Explained
+
+ - **Normalized Utility ↑**: Higher values indicate better text quality
+ - **Detection Rate (%) ↑**: Higher values indicate better watermark detection
+ - **Absolute Utility Degradation ↑**: Higher values indicate better resistance to removal attacks
+ - **Adversary BERT Score ↑**: Higher values indicate better performance under adversarial conditions
+
+ ## Contributing
+
+ We encourage researchers to contribute their evaluation results. Please follow the guidelines in the "Guidelines" section for submission requirements.
+
+ ## License
+
+ MIT License
+
+ ---
+ *Last updated: December 2024*
+ """
+
+     with open("README.md", "w", encoding="utf-8") as f:
+         f.write(readme_content)
+
+     print("✅ Created Hugging Face README.md")
+
+ def main():
+     """Main deployment function"""
+     print("🚀 Preparing Watermark Leaderboard for Hugging Face deployment...")
+
+     # Create HF README
+     create_hf_readme()
+
+     # Copy files
+     copy_files_to_hf_directory()
+
+     print("\n📋 Deployment Checklist:")
+     print("✅ All files prepared")
+     print("✅ Requirements.txt updated")
+     print("✅ README.md created for Hugging Face")
+     print("✅ Reproducibility codes included")
+
+     print("\n🌐 Ready for Hugging Face Spaces deployment!")
+
+ if __name__ == "__main__":
+     main()
index.html ADDED
@@ -0,0 +1,13 @@
+ <!DOCTYPE html>
+ <html lang="en">
+   <head>
+     <meta charset="UTF-8" />
+     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+     <title>Watermark Leaderboard</title>
+     <script type="module" crossorigin src="./assets/index-Cd6CRo7g.js"></script>
+     <link rel="stylesheet" crossorigin href="./assets/index-tTSI8ghR.css">
+   </head>
+   <body>
+     <div id="root"></div>
+   </body>
+ </html>
leaderboard.json ADDED
@@ -0,0 +1,122 @@
+ [
+   {
+     "name": "KGW",
+     "model": "LLaMA3",
+     "normalizedUtility": 0.601,
+     "detectionRate": 91.43,
+     "removal_detectionRate": 3.9,
+     "absoluteUtilityDegregation": 0.028999999999999915,
+     "adversaryBERTscore": 0.785,
+     "adversaryDetectionRate": 0.72
+   },
+   {
+     "name": "SIR",
+     "model": "LLaMA3",
+     "normalizedUtility": 0.596,
+     "detectionRate": 82.98,
+     "removal_detectionRate": 22.35,
+     "absoluteUtilityDegregation": 0.026499999999999968,
+     "adversaryBERTscore": 0.785,
+     "adversaryDetectionRate": 64.09
+   },
+   {
+     "name": "SynthID",
+     "model": "LLaMA3",
+     "normalizedUtility": 0.591,
+     "detectionRate": 49.76,
+     "removal_detectionRate": 1.435,
+     "absoluteUtilityDegregation": 0.02474999999999994,
+     "adversaryBERTscore": 0.788,
+     "adversaryDetectionRate": 1.4
+   },
+   {
+     "name": "DTM",
+     "model": "LLaMA3",
+     "normalizedUtility": 0.74,
+     "detectionRate": 85.64,
+     "removal_detectionRate": 2.835,
+     "absoluteUtilityDegregation": 0.05545,
+     "adversaryBERTscore": 0.798,
+     "adversaryDetectionRate": 29.1
+   },
+   {
+     "name": "TW",
+     "model": "LLaMA3",
+     "normalizedUtility": 0.852,
+     "detectionRate": 88.51,
+     "removal_detectionRate": 11.56,
+     "absoluteUtilityDegregation": 0.16169999999999995,
+     "adversaryBERTscore": 0.807,
+     "adversaryDetectionRate": 2.3
+   },
+   {
+     "name": "SafeSeal",
+     "model": "LLaMA3",
+     "normalizedUtility": 0.982,
+     "detectionRate": 89.69,
+     "removal_detectionRate": 37.63,
+     "absoluteUtilityDegregation": 0.28105,
+     "adversaryBERTscore": 0.788,
+     "adversaryDetectionRate": 69.23
+   },
+   {
+     "name": "KGW",
+     "model": "DeepSeek",
+     "normalizedUtility": 0.602,
+     "detectionRate": 82.81,
+     "removal_detectionRate": 1.98,
+     "absoluteUtilityDegregation": 0.026499999999999968,
+     "adversaryBERTscore": 0.777,
+     "adversaryDetectionRate": 0.11
+   },
+   {
+     "name": "SIR",
+     "model": "DeepSeek",
+     "normalizedUtility": 0.594,
+     "detectionRate": 95.8,
+     "removal_detectionRate": 55.16,
+     "absoluteUtilityDegregation": 0.03005000000000002,
+     "adversaryBERTscore": 0.767,
+     "adversaryDetectionRate": 67.38
+   },
+   {
+     "name": "SynthID",
+     "model": "DeepSeek",
+     "normalizedUtility": 0.566,
+     "detectionRate": 49.24,
+     "removal_detectionRate": 1.49,
+     "absoluteUtilityDegregation": 0.018500000000000072,
+     "adversaryBERTscore": 0.77,
+     "adversaryDetectionRate": 1.4
+   },
+   {
+     "name": "DTM",
+     "model": "DeepSeek",
+     "normalizedUtility": 0.748,
+     "detectionRate": 89.7,
+     "removal_detectionRate": 8.54,
+     "absoluteUtilityDegregation": 0.054850000000000065,
+     "adversaryBERTscore": 0.779,
+     "adversaryDetectionRate": 3.5
+   },
+   {
+     "name": "TW",
+     "model": "DeepSeek",
+     "normalizedUtility": 0.857,
+     "detectionRate": 85.22,
+     "removal_detectionRate": 15.39,
+     "absoluteUtilityDegregation": 0.15155000000000007,
+     "adversaryBERTscore": 0.776,
+     "adversaryDetectionRate": 59.5
+   },
+   {
+     "name": "SafeSeal",
+     "model": "DeepSeek",
+     "normalizedUtility": 0.981,
+     "detectionRate": 90.92,
+     "removal_detectionRate": 60.05,
+     "absoluteUtilityDegregation": 0.26935,
+     "adversaryBERTscore": 0.778,
+     "adversaryDetectionRate": 74.1
+   }
+ ]
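Each entry in leaderboard.json is one flat record per (method, model) pair, which is what makes the app's model filter and per-metric ranking straightforward. A minimal sketch of how a consumer might filter and rank these records (the two-entry subset below is copied from the file above; the filtering and sorting logic is illustrative, not the app's actual code):

```python
import json

# Subset of leaderboard.json, reproduced here so the sketch is self-contained.
sample = json.loads("""
[
  {"name": "KGW", "model": "LLaMA3", "normalizedUtility": 0.601, "detectionRate": 91.43},
  {"name": "SafeSeal", "model": "LLaMA3", "normalizedUtility": 0.982, "detectionRate": 89.69},
  {"name": "SIR", "model": "DeepSeek", "normalizedUtility": 0.594, "detectionRate": 95.8}
]
""")

# Filter to one model, then rank descending by the selected metric,
# mirroring what the leaderboard view does for "Detection Rate (%) ↑".
llama3 = [entry for entry in sample if entry["model"] == "LLaMA3"]
ranked = sorted(llama3, key=lambda e: e["detectionRate"], reverse=True)
print([e["name"] for e in ranked])  # → ['KGW', 'SafeSeal']
```

The same two lines generalize to any of the schema's numeric keys (e.g. `normalizedUtility` or `adversaryBERTscore`) by swapping the sort key.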
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ gradio>=4.44.0
+ pandas>=1.5.0
+ plotly>=5.0.0
+ numpy>=1.21.0
test.html ADDED
@@ -0,0 +1,16 @@
+ <!DOCTYPE html>
+ <html lang="en">
+   <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>Test Page</title>
+   </head>
+   <body>
+     <h1>Test Page</h1>
+     <p>If you can see this, the static deployment is working.</p>
+     <p>Current time: <span id="time"></span></p>
+     <script>
+       document.getElementById('time').textContent = new Date().toLocaleString();
+     </script>
+   </body>
+ </html>