metadata
title: SafeSeal Watermark
emoji: π
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.50.0
app_file: app.py
pinned: false
license: mit
SafeSeal Watermark
Content-Preserving Watermarking for Large Language Model Deployments.
Generate watermarked text by replacing words with context-aware synonyms chosen through key-conditioned sampling.
Features
- Secret Key: Deterministic watermarking with a user-controlled key
- BERTScore Filtering: Adjustable similarity threshold (0.0 - 1.0)
- Tournament Sampling: Select synonyms using tournament-based randomization
- Visual Highlighting: See exactly which words were changed
- GPU Support: Fast inference with automatic GPU detection
- Smart Filtering: Excludes antonyms and specific nouns in the same category, and preserves entity names
How It Works
- Entity Detection: Extracts eligible words (nouns, verbs, adjectives, adverbs) while skipping named entities
- Candidate Generation: Uses RoBERTa-base to generate semantically similar alternatives
- BERTScore Filtering: Evaluates candidates against a similarity threshold
- Tournament Selection: Deterministically selects replacements based on secret key
- Visualization: Highlights changed words in the output
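The filtering step above can be illustrated with a minimal sketch. It assumes each candidate already has a similarity score against the original word in context; the `filter_candidates` helper and the example scores are hypothetical and are not taken from the app's code, where the scores would come from BERTScore.

```python
from typing import Dict, List


def filter_candidates(scores: Dict[str, float], threshold: float = 0.98) -> List[str]:
    """Keep candidates whose similarity score meets the threshold, best first."""
    kept = [(word, score) for word, score in scores.items() if score >= threshold]
    # Sort by descending score, then alphabetically so ties are resolved deterministically.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in kept]


# Hypothetical scores for replacing "quick"; real values would come from BERTScore.
example_scores = {"swift": 0.991, "fast": 0.987, "slow": 0.962, "rapid": 0.979}

print(filter_candidates(example_scores))        # ['swift', 'fast']
print(filter_candidates(example_scores, 0.97))  # ['swift', 'fast', 'rapid']
```

Lowering the threshold admits more candidates, which gives the selection step more choices but allows replacements that drift further from the original meaning.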
Usage
- Enter your text in the left panel
- Adjust hyperparameters in the sidebar:
  - Secret Key: Used for deterministic randomization
  - Threshold: Similarity threshold (default: 0.98)
  - Tournament parameters: Fine-tune the selection process
- Click "Generate Watermark"
- View the watermarked text with highlighted changes
Parameters
- Secret Key: Used for deterministic randomization
- Threshold (0.98): BERTScore similarity threshold; higher values permit only more conservative changes
- m (10): Number of tournament rounds
- c (2): Competitors per tournament match
- h (6): Context size (number of tokens to the left to consider)
- Alpha (1.1): Temperature scaling factor
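To make the roles of the secret key, m, c, and h more concrete, here is a rough sketch of one way a key-conditioned tournament could work. It is an illustration under assumptions, not the app's actual implementation: the `g_value` and `tournament_select` helpers are hypothetical, the keyed scores come from HMAC-SHA256 in the standard library, and Alpha is not modeled at all.

```python
import hashlib
import hmac
from typing import List, Sequence


def g_value(key: str, context: Sequence[str], candidate: str, round_idx: int) -> int:
    """Keyed pseudo-random score for a candidate in a given round (hypothetical helper)."""
    msg = "\x1f".join([*context, candidate, str(round_idx)]).encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def tournament_select(
    key: str,
    left_tokens: Sequence[str],
    candidates: Sequence[str],
    m: int = 10,
    c: int = 2,
    h: int = 6,
) -> str:
    """Pick one candidate deterministically from the key and the last h left tokens."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if c < 2:
        raise ValueError("each match needs at least two competitors")
    context: List[str] = list(left_tokens)[-h:] if h > 0 else []
    pool = sorted(set(candidates))  # canonical order, so input order does not matter
    for round_idx in range(m):
        if len(pool) == 1:
            break
        # Split the pool into matches of up to c competitors; the highest keyed score wins.
        pool = [
            max(pool[i:i + c], key=lambda w: (g_value(key, context, w, round_idx), w))
            for i in range(0, len(pool), c)
        ]
    # If the rounds ran out before a single winner emerged, use one final keyed score.
    return max(pool, key=lambda w: (g_value(key, context, w, m), w))


choice = tournament_select("my-secret", ["The"], ["quick", "swift", "fast", "rapid"])
print(choice)
```

Because every score depends only on the key, the context window, the candidate, and the round index, anyone holding the same key can recompute the same choice, while the result looks arbitrary without it.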
Technical Details
- Model: RoBERTa-base for masked language modeling
- Similarity Scoring: BERTScore F1 scores
- Selection: Tournament-based deterministic sampling
- Filtering: POS tag matching, antonym exclusion, semantic compatibility checks
Example
Input:
"The quick brown fox jumps over the lazy dog."
Watermarked Output:
"The swift brown fox leaps over the idle dog."
Changed words highlighted: swift (was quick), leaps (was jumps), idle (was lazy)
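The list of changed words in this example can be reproduced with a small comparison routine. This is a minimal sketch that assumes the watermark only substitutes words one-for-one, so both texts split into the same number of whitespace-separated tokens; the `changed_words` and `highlight` helpers are hypothetical, and the `**` markers merely stand in for whatever highlighting the app renders.

```python
import string
from typing import List, Tuple


def changed_words(original: str, watermarked: str) -> List[Tuple[str, str]]:
    """Return (new_word, old_word) pairs for every position where the texts differ."""
    old_tokens, new_tokens = original.split(), watermarked.split()
    if len(old_tokens) != len(new_tokens):
        raise ValueError("texts must contain the same number of tokens")
    return [
        (new.strip(string.punctuation), old.strip(string.punctuation))
        for old, new in zip(old_tokens, new_tokens)
        if old != new
    ]


def highlight(original: str, watermarked: str) -> str:
    """Wrap each changed token of the watermarked text in ** markers."""
    old_tokens, new_tokens = original.split(), watermarked.split()
    if len(old_tokens) != len(new_tokens):
        raise ValueError("texts must contain the same number of tokens")
    return " ".join(
        new if old == new else f"**{new}**"
        for old, new in zip(old_tokens, new_tokens)
    )


source = "The quick brown fox jumps over the lazy dog."
marked = "The swift brown fox leaps over the idle dog."

print(changed_words(source, marked))
# [('swift', 'quick'), ('leaps', 'jumps'), ('idle', 'lazy')]
print(highlight(source, marked))
# The **swift** brown fox **leaps** over the **idle** dog.
```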
Demo Version: This is a demonstration that uses a lightweight model to showcase the watermarking pipeline. Results may not be perfect and are intended for testing purposes only.