---
title: SafeSeal Watermark
emoji: 🔒
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.50.0
app_file: app.py
pinned: false
license: mit
---

# SafeSeal Watermark

**Content-Preserving Watermarking for Large Language Model Deployments.**

Generate watermarked text by replacing words with context-aware synonyms chosen through key-conditioned sampling.

## Features

- 🔑 **Secret Key**: Deterministic watermarking with a user-controlled key
- 📊 **BERTScore Filtering**: Adjustable similarity threshold (0.0 - 1.0)
- 🏆 **Tournament Sampling**: Selects synonyms using tournament-based randomization
- ✨ **Visual Highlighting**: See exactly which words were changed
- 🚀 **GPU Support**: Fast inference with automatic GPU detection
- 🛡️ **Smart Filtering**: Excludes antonyms and specific nouns from the same category, and preserves entity names

## How It Works

1. **Entity Detection**: Extracts eligible words (nouns, verbs, adjectives, adverbs) while skipping named entities
2. **Candidate Generation**: Uses RoBERTa-base to generate semantically similar alternatives
3. **BERTScore Filtering**: Keeps only candidates whose similarity score meets the threshold
4. **Tournament Selection**: Deterministically selects replacements based on the secret key
5. **Visualization**: Highlights changed words in the output

## Usage

1. Enter your text in the left panel
2. Adjust hyperparameters in the sidebar:
   - **Secret Key**: Used for deterministic randomization
   - **Threshold**: Similarity threshold (default: 0.98)
   - **Tournament parameters**: Fine-tune the selection process
3. Click "🚀 Generate Watermark"
4. View the watermarked text with highlighted changes

## Parameters

Default values are shown in parentheses.

- **Secret Key**: Used for deterministic randomization
- **Threshold (0.98)**: BERTScore similarity threshold; higher values allow fewer, more conservative changes
- **m (10)**: Number of tournament rounds
- **c (2)**: Competitors per tournament match
- **h (6)**: Context size (number of left tokens to consider)
- **Alpha (1.1)**: Temperature scaling factor

## Technical Details

- **Model**: RoBERTa-base for masked language modeling
- **Similarity Scoring**: BERTScore F1 scores
- **Selection**: Tournament-based deterministic sampling
- **Filtering**: POS tag matching, antonym exclusion, semantic compatibility checks

## Example

**Input:**

> "The quick brown fox jumps over the lazy dog."

**Watermarked Output:**

> "The swift brown fox leaps over the idle dog."

Changed words highlighted: swift (was quick), leaps (was jumps), idle (was lazy)

⚠️ **Demo Version**: This is a demonstration that uses a lightweight model to showcase the watermarking pipeline. Results may not be perfect and are intended for testing purposes only.
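
## Illustrative Sketch

To make the threshold filtering and tournament selection steps more concrete, here is a minimal Python sketch of one way key-conditioned selection could work. It is not the code used by this app: the function names (`filter_by_threshold`, `keyed_score`, `tournament_select`), the HMAC-SHA256 scoring, and the champion-versus-challengers match format are all assumptions made for illustration, and the similarity scores in the usage lines are made up rather than real BERTScore output. The Alpha parameter is left out because its exact role is not described here.

```python
import hashlib
import hmac
import random


def filter_by_threshold(scores, threshold=0.98):
    """Keep candidates whose (precomputed) similarity score meets the threshold."""
    return [word for word, score in scores.items() if score >= threshold]


def keyed_score(secret_key, context, candidate, round_idx):
    """Pseudorandom score derived from the key, context, candidate, and round."""
    message = "\x1f".join([*context, candidate, str(round_idx)]).encode("utf-8")
    digest = hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def tournament_select(candidates, secret_key, left_context, m=10, c=2, h=6):
    """Pick one candidate deterministically from the key and the last h tokens.

    Hypothetical format: m rounds, each a match between the current champion
    and c - 1 challengers; the highest keyed score wins the match.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    # Only the last h left-context tokens influence the choice.
    context = tuple(left_context[-h:]) if h > 0 else ()
    # Sort and deduplicate so the result does not depend on input order.
    pool = sorted(set(candidates))
    # Seed the challenger draws from the key and context only.
    seed_digest = hmac.new(
        secret_key.encode("utf-8"),
        "\x1f".join(context).encode("utf-8"),
        hashlib.sha256,
    ).digest()
    rng = random.Random(int.from_bytes(seed_digest, "big"))
    champion = rng.choice(pool)
    for round_idx in range(m):
        challengers = [rng.choice(pool) for _ in range(c - 1)]
        champion = max(
            [champion, *challengers],
            # Ties on the keyed score are broken by the word itself.
            key=lambda w: (keyed_score(secret_key, context, w, round_idx), w),
        )
    return champion


# Usage with made-up similarity scores for replacing "quick" after "The".
scores = {"swift": 0.991, "fast": 0.985, "rapid": 0.97}
candidates = filter_by_threshold(scores, threshold=0.98)
choice = tournament_select(candidates, "my-secret", ["The"])
print(candidates, choice)
```

Because every source of randomness is derived from the secret key and the left context, running the selection again with the same inputs yields the same replacement, which is the property the Secret Key parameter relies on.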