| title: SafeSeal Watermark | |
| emoji: π | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: streamlit | |
| sdk_version: 1.50.0 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| # SafeSeal Watermark | |
| **Content-Preserving Watermarking for Large Language Model Deployments.** | |
| Generate watermarked text by replacing words with context-aware synonyms selected through key-conditioned sampling. | |
| ## Features | |
| - **Secret Key**: Deterministic watermarking with a user-controlled key | |
| - **BERTScore Filtering**: Adjustable similarity threshold (0.0 to 1.0) | |
| - **Tournament Sampling**: Select synonyms using tournament-based randomization | |
| - ✨ **Visual Highlighting**: See exactly which words were changed | |
| - **GPU Support**: Fast inference with automatic GPU detection | |
| - 🛡️ **Smart Filtering**: Excludes antonyms and specific nouns in the same category, and preserves entity names | |
| ## How It Works | |
| 1. **Entity Detection**: Extracts eligible words (nouns, verbs, adjectives, adverbs) while skipping named entities | |
| 2. **Candidate Generation**: Uses RoBERTa-base to generate semantically similar alternatives | |
| 3. **BERTScore Filtering**: Evaluates candidates against a similarity threshold | |
| 4. **Tournament Selection**: Deterministically selects replacements based on secret key | |
| 5. **Visualization**: Highlights changed words in the output | |
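
Step 3 can be illustrated with a minimal sketch. It assumes the BERTScore F1 values have already been computed for each candidate. The function name `filter_candidates`, the inclusive comparison, and the sample scores are illustrative assumptions, not the app's actual code.

```python
def filter_candidates(scored_candidates, threshold=0.98):
    """Keep (word, score) pairs whose score meets the threshold, best first.

    Assumption: a candidate passes when its score is >= threshold.
    """
    kept = [(word, score) for word, score in scored_candidates if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for replacements of "quick"; real values come from BERTScore.
scored = [("swift", 0.991), ("fast", 0.975), ("rapid", 0.984)]
survivors = filter_candidates(scored, threshold=0.98)
```

With these made-up scores, only `swift` and `rapid` survive. Raising the threshold shrinks the candidate set, so fewer words end up being replaced.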
| ## Usage | |
| 1. Enter your text in the left panel | |
| 2. Adjust hyperparameters in the sidebar: | |
| - **Secret Key**: Used for deterministic randomization | |
| - **Threshold**: Similarity threshold (default: 0.98) | |
| - **Tournament parameters**: Fine-tune the selection process | |
| 3. Click "Generate Watermark" | |
| 4. View the watermarked text with highlighted changes | |
| ## Parameters | |
| - **Secret Key**: Used for deterministic randomization | |
| - **Threshold (0.98)**: BERTScore similarity threshold; higher values yield more conservative changes | |
| - **m (10)**: Number of tournament rounds | |
| - **c (2)**: Competitors per tournament match | |
| - **h (6)**: Context size (number of tokens to the left that are considered) | |
| - **Alpha (1.1)**: Temperature scaling factor | |
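
To make the roles of these parameters concrete, here is a rough sketch of one way a key-conditioned tournament could be built. It is not the app's implementation: the helper names, the use of HMAC-SHA256 as the keyed function, the `c**m` entrant pool, and in particular the use of `alpha` as an exponent on the candidate scores are all assumptions made for illustration.

```python
import hashlib
import hmac
import random


def keyed_value(secret_key: str, *parts: str) -> int:
    """Map the key plus some strings to a pseudorandom 64-bit integer."""
    message = "\x1f".join(parts).encode("utf-8")
    digest = hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def tournament_select(candidates, scores, left_context, secret_key,
                      m=10, c=2, h=6, alpha=1.1):
    """Deterministically pick one candidate via an m-round, c-way tournament."""
    context = " ".join(left_context[-h:])  # only the last h left tokens are used
    # Assumption: alpha sharpens the scores before they act as sampling weights.
    weights = [score ** alpha for score in scores]
    # Draw c**m entrants from a generator seeded by the key and the context.
    rng = random.Random(keyed_value(secret_key, "seed", context))
    entrants = rng.choices(candidates, weights=weights, k=c ** m)
    for round_index in range(m):
        winners = []
        for start in range(0, len(entrants), c):
            match = entrants[start:start + c]
            # Each match is won by the word with the largest keyed value
            # for this round and context.
            winners.append(max(
                match,
                key=lambda w: keyed_value(secret_key, str(round_index), context, w),
            ))
        entrants = winners
    return entrants[0]


choice = tournament_select(
    ["swift", "fast", "rapid"], [0.99, 0.985, 0.981],
    left_context=["The"], secret_key="my-secret",
)
```

Because every source of randomness is derived from the secret key and the left context, the same inputs always give the same replacement, while a different key generally steers the choice elsewhere. Scores must be positive for the weighted draw to work.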
| ## Technical Details | |
| - **Model**: RoBERTa-base for masked language modeling | |
| - **Similarity Scoring**: BERTScore F1 scores | |
| - **Selection**: Tournament-based deterministic sampling | |
| - **Filtering**: POS tag matching, antonym exclusion, semantic compatibility checks | |
| ## Example | |
| **Input:** | |
| > "The quick brown fox jumps over the lazy dog." | |
| **Watermarked Output:** | |
| > "The swift brown fox leaps over the idle dog." | |
| Changed words highlighted: swift (was quick), leaps (was jumps), idle (was lazy) | |
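
The comparison shown in this example can be reproduced with a small sketch. It assumes one-for-one word substitution (both texts have the same number of whitespace-separated words), which holds for the example above; `changed_words` and `highlight` are hypothetical helper names, and the Markdown bold markers stand in for whatever highlighting the app actually renders.

```python
def changed_words(original: str, watermarked: str):
    """Return (new_word, old_word) pairs at positions where the texts differ."""
    old_tokens = original.split()
    new_tokens = watermarked.split()
    if len(old_tokens) != len(new_tokens):
        raise ValueError("texts must contain the same number of words")
    return [(new, old) for old, new in zip(old_tokens, new_tokens) if old != new]


def highlight(original: str, watermarked: str) -> str:
    """Wrap each changed word of the watermarked text in Markdown bold markers."""
    old_tokens = original.split()
    new_tokens = watermarked.split()
    if len(old_tokens) != len(new_tokens):
        raise ValueError("texts must contain the same number of words")
    marked = [new if new == old else f"**{new}**"
              for old, new in zip(old_tokens, new_tokens)]
    return " ".join(marked)


source = "The quick brown fox jumps over the lazy dog."
output = "The swift brown fox leaps over the idle dog."
pairs = changed_words(source, output)
marked_text = highlight(source, output)
```

Here `pairs` lists the three substitutions from the example, and `marked_text` is the watermarked sentence with `swift`, `leaps`, and `idle` in bold.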
| ⚠️ **Demo Version**: This is a demonstration that uses a lightweight model to showcase the watermarking pipeline. Results may not be perfect and are intended for testing purposes only. | |