SafeSeal / README.md

kirudang

Sync SafeSeal app

fc6dcab 3 months ago

	---
	title: SafeSeal Watermark
	emoji: 🔒
	colorFrom: blue
	colorTo: purple
	sdk: streamlit
	sdk_version: 1.50.0
	app_file: app.py
	pinned: false
	license: mit
	---

	# SafeSeal Watermark

	Content-Preserving Watermarking for Large Language Model Deployments.

	Generate watermarked text by replacing words with context-aware synonyms chosen through key-conditioned sampling.

	## Features

	- 🔑 Secret Key: Deterministic watermarking with user-controlled key
	- 📊 BERTScore Filtering: Adjustable similarity threshold (0.0 - 1.0)
	- 🏆 Tournament Sampling: Select synonyms using tournament-based randomization
	- ✨ Visual Highlighting: See exactly which words were changed
	- 🚀 GPU Support: Fast inference with automatic GPU detection
	- 🛡️ Smart Filtering: Excludes antonyms and specific nouns in the same category, and preserves entity names

	## How It Works

	1. Entity Detection: Extracts eligible words (nouns, verbs, adjectives, adverbs) while skipping named entities
	2. Candidate Generation: Uses RoBERTa-base to generate semantically similar alternatives
	3. BERTScore Filtering: Evaluates candidates against a similarity threshold
	4. Tournament Selection: Deterministically selects replacements based on secret key
	5. Visualization: Highlights changed words in the output

	## Usage

	1. Enter your text in the left panel
	2. Adjust hyperparameters in the sidebar:
	   - Secret Key: Used for deterministic randomization
	   - Threshold: Similarity threshold (default: 0.98)
	   - Tournament parameters: Fine-tune the selection process
	3. Click "🚀 Generate Watermark"
	4. View the watermarked text with highlighted changes

	## Parameters

	- Secret Key: Used for deterministic randomization
	- Threshold (0.98): BERTScore similarity threshold; higher values produce more conservative changes
	- m (10): Number of tournament rounds
	- c (2): Competitors per tournament match
	- h (6): Context size (number of tokens to the left of the target word to consider)
	- Alpha (1.1): Temperature scaling factor
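
The parameters above can be tied together in a minimal sketch of keyed tournament selection. This is not the app's actual implementation: the helper names `keyed_int` and `tournament_select`, the HMAC-based scoring, and the way Alpha is applied (as an exponent on the sampling weights) are all assumptions made for illustration, and the similarity scores in the usage lines are made up.

```python
import hashlib
import hmac
import random


def keyed_int(secret_key: str, *parts: str) -> int:
    """Map (key, parts) to a reproducible 64-bit integer using HMAC-SHA256."""
    message = "\x1f".join(parts).encode("utf-8")
    digest = hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def tournament_select(candidates, secret_key, left_context, m=10, c=2, h=6, alpha=1.1):
    """Pick one candidate through an m-round knockout with c competitors per match.

    candidates: dict mapping word -> positive similarity score.
    left_context: list of tokens that precede the word being replaced.
    """
    # Only the last h tokens of left context condition the selection.
    context = " ".join(left_context[-h:]) if h > 0 else ""
    words = sorted(candidates)  # fixed order keeps the draw reproducible
    # Assumption: alpha sharpens the sampling weights (score ** alpha).
    weights = [candidates[w] ** alpha for w in words]

    # Draw the c**m initial competitors from an RNG seeded by key and context,
    # so no global random state is involved.
    rng = random.Random(keyed_int(secret_key, "seed", context))
    field = rng.choices(words, weights=weights, k=c ** m)

    for round_index in range(m):
        winners = []
        for start in range(0, len(field), c):
            match = field[start:start + c]
            # The highest keyed score wins; the word itself breaks exact ties.
            winners.append(
                max(match, key=lambda w: (keyed_int(secret_key, "g", str(round_index), context, w), w))
            )
        field = winners
    return field[0]


# Hypothetical candidates and scores for replacing "quick" after "The".
candidates = {"swift": 0.991, "fast": 0.987, "rapid": 0.984}
choice = tournament_select(candidates, "my-secret", ["The"])
print(choice)
```

Because every random quantity is derived from the secret key and the left context, calling the function twice with the same inputs returns the same word, which is the property a detector holding the key would rely on.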

	## Technical Details

	- Model: RoBERTa-base for masked language modeling
	- Similarity Scoring: BERTScore F1 scores
	- Selection: Tournament-based deterministic sampling
	- Filtering: POS tag matching, antonym exclusion, semantic compatibility checks
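
As a rough illustration of the filtering stage, the sketch below applies POS matching, antonym exclusion, and the BERTScore threshold to candidates whose tags and F1 scores are already computed. The `Candidate` type, the `filter_candidates` function, the inclusive `>=` comparison, and all data values are hypothetical; the semantic compatibility checks are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    word: str
    pos: str    # coarse part-of-speech tag, e.g. "ADJ"
    f1: float   # precomputed BERTScore F1 against the original sentence


def filter_candidates(original_pos, candidates, antonyms, threshold=0.98):
    """Keep candidates that match the POS tag, are not antonyms, and meet the threshold."""
    kept = []
    for cand in candidates:
        if cand.pos != original_pos:
            continue  # POS tag must match the original word
        if cand.word.lower() in antonyms:
            continue  # antonyms would flip the meaning
        if cand.f1 < threshold:
            continue  # assumption: scores equal to the threshold are kept
        kept.append(cand)
    return kept


# Made-up candidates for the adjective "quick".
pool = [
    Candidate("swift", "ADJ", 0.991),
    Candidate("slow", "ADJ", 0.985),     # dropped: antonym
    Candidate("quickly", "ADV", 0.990),  # dropped: POS mismatch
    Candidate("speedy", "ADJ", 0.950),   # dropped: below threshold
]
survivors = filter_candidates("ADJ", pool, antonyms={"slow"})
print([c.word for c in survivors])
```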

	## Example

	Input:
	> "The quick brown fox jumps over the lazy dog."

	Watermarked Output:
	> "The swift brown fox leaps over the idle dog."

	Changed words highlighted: swift (was quick), leaps (was jumps), idle (was lazy)
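
The list of changed words can be recovered by comparing the two sentences token by token. The sketch below assumes one-for-one word substitution (both texts have the same number of whitespace-separated tokens); `changed_words` and `highlight` are illustrative names, not functions from the app.

```python
def changed_words(original: str, watermarked: str):
    """Return (position, old, new) for every token that differs."""
    old_tokens = original.split()
    new_tokens = watermarked.split()
    if len(old_tokens) != len(new_tokens):
        raise ValueError("texts must have the same number of tokens")
    return [
        (i, old, new)
        for i, (old, new) in enumerate(zip(old_tokens, new_tokens))
        if old != new
    ]


def highlight(original: str, watermarked: str, left: str = "**", right: str = "**") -> str:
    """Wrap each changed token of the watermarked text in the given markers."""
    changed = {i for i, _, _ in changed_words(original, watermarked)}
    return " ".join(
        f"{left}{tok}{right}" if i in changed else tok
        for i, tok in enumerate(watermarked.split())
    )


source = "The quick brown fox jumps over the lazy dog."
marked = "The swift brown fox leaps over the idle dog."
print(changed_words(source, marked))
# [(1, 'quick', 'swift'), (4, 'jumps', 'leaps'), (7, 'lazy', 'idle')]
print(highlight(source, marked))
# The **swift** brown fox **leaps** over the **idle** dog.
```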

	⚠️ Demo Version: This is a demonstration that uses a lightweight model to showcase the watermarking pipeline. Results may be imperfect, and the demo is intended for testing purposes only.
