---
title: SafeSeal Watermark
emoji: 🔒
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.50.0
app_file: app.py
pinned: false
license: mit
---

# SafeSeal Watermark

**Content-Preserving Watermarking for Large Language Model Deployments.**

Generate watermarked text by replacing words with context-aware synonyms chosen through key-conditioned sampling.

## Features

- 🔑 **Secret Key**: Deterministic watermarking with user-controlled key
- 📊 **BERTScore Filtering**: Adjustable similarity threshold (0.0 - 1.0)
- 🏆 **Tournament Sampling**: Select synonyms using tournament-based randomization
- ✨ **Visual Highlighting**: See exactly which words were changed
- 🚀 **GPU Support**: Fast inference with automatic GPU detection
- 🛡️ **Smart Filtering**: Excludes antonyms and specific nouns in the same category, and preserves entity names

## How It Works

1. **Entity Detection**: Extracts eligible words (nouns, verbs, adjectives, adverbs) while skipping named entities
2. **Candidate Generation**: Uses RoBERTa-base to generate semantically similar alternatives
3. **BERTScore Filtering**: Evaluates candidates against a similarity threshold
4. **Tournament Selection**: Deterministically selects replacements based on secret key
5. **Visualization**: Highlights changed words in the output
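
The tournament selection step (step 4) can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function names `g_value` and `tournament_select`, the use of HMAC-SHA256 for scoring, and the bracket layout are all assumptions made for the example. It only shows how a secret key plus the left context can drive a repeatable choice among candidates that have already passed filtering.

```python
import hashlib
import hmac


def g_value(key, context, round_idx, candidate):
    """Pseudo-random score derived from the key, left context, round, and candidate.

    HMAC-SHA256 is an assumption for this sketch; any keyed hash would do.
    """
    message = "\x1f".join([*context, str(round_idx), candidate]).encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def tournament_select(candidates, left_tokens, key, m=10, c=2, h=6):
    """Pick one candidate through up to m rounds of c-way matches (hypothetical layout)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if c < 2:
        raise ValueError("c must be at least 2")
    # Keep only the last h tokens to the left of the target word.
    context = tuple(left_tokens[-h:]) if h > 0 else ()
    # Sort and deduplicate so the result does not depend on upstream ordering.
    pool = sorted(set(candidates))
    for round_idx in range(m):
        if len(pool) == 1:
            break
        winners = []
        for start in range(0, len(pool), c):
            match = pool[start:start + c]
            # The competitor with the highest keyed score wins the match.
            winners.append(max(match, key=lambda w: g_value(key, context, round_idx, w)))
        pool = winners
    # If several candidates survive all m rounds, break the tie with one more score.
    return max(pool, key=lambda w: g_value(key, context, m, w))


if __name__ == "__main__":
    candidates = ["swift", "fast", "rapid", "speedy"]
    print(tournament_select(candidates, ["The"], key="my-secret-key"))
```

Because every score comes from a keyed hash, the same key, context, and candidate set always produce the same winner, which is what makes the replacement pattern reproducible by anyone holding the key.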

## Usage

1. Enter your text in the left panel
2. Adjust hyperparameters in the sidebar:
   - **Secret Key**: Used for deterministic randomization
   - **Threshold**: Similarity threshold (default: 0.98)
   - **Tournament parameters**: Fine-tune the selection process
3. Click "🚀 Generate Watermark"
4. View the watermarked text with highlighted changes

## Parameters

- **Secret Key**: Used for deterministic randomization
- **Threshold (0.98)**: BERTScore similarity threshold; higher values yield more conservative changes
- **m (10)**: Number of tournament rounds
- **c (2)**: Competitors per tournament match
- **h (6)**: Context size (number of preceding tokens considered)
- **Alpha (1.1)**: Temperature scaling factor
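
For scripting or testing, the parameters above could be grouped into a single settings object. The sketch below is hypothetical: the class name `WatermarkParams`, its field names, and the validation rules for `m`, `c`, and `h` are assumptions, not part of the app. Only the default values and the 0.0 to 1.0 threshold range come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatermarkParams:
    """Hypothetical container for the sidebar parameters (names are assumptions)."""

    secret_key: str
    threshold: float = 0.98  # BERTScore similarity threshold
    m: int = 10              # number of tournament rounds
    c: int = 2               # competitors per tournament match
    h: int = 6               # context size (preceding tokens)
    alpha: float = 1.1       # temperature scaling factor

    def __post_init__(self):
        # The README gives 0.0 - 1.0 as the adjustable threshold range.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0.0 and 1.0")
        # The remaining checks are assumed sanity bounds, not documented limits.
        if self.m < 1:
            raise ValueError("m must be at least 1")
        if self.c < 2:
            raise ValueError("c must be at least 2")
        if self.h < 0:
            raise ValueError("h must be non-negative")


if __name__ == "__main__":
    print(WatermarkParams(secret_key="my-secret-key"))
```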

## Technical Details

- **Model**: RoBERTa-base for masked language modeling
- **Similarity Scoring**: BERTScore F1 scores
- **Selection**: Tournament-based deterministic sampling
- **Filtering**: POS tag matching, antonym exclusion, semantic compatibility checks
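
To illustrate how the similarity threshold interacts with scoring, here is a small sketch that filters candidates whose BERTScore F1 has already been computed. The function name `filter_by_threshold`, the `>=` comparison, and the scores in the example are all assumptions; the scores are made-up placeholders for illustration, not measured values.

```python
def filter_by_threshold(scored_candidates, threshold=0.98):
    """Keep candidates whose precomputed F1 score meets the threshold.

    scored_candidates is an iterable of (word, f1) pairs. Whether the real
    pipeline uses >= or > is not stated, so >= is an assumption here.
    """
    return [word for word, f1 in scored_candidates if f1 >= threshold]


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    scored = [("swift", 0.991), ("fast", 0.984), ("rapid", 0.975)]
    print(filter_by_threshold(scored, threshold=0.98))
    print(filter_by_threshold(scored, threshold=0.99))
```

Raising the threshold shrinks the candidate set, so fewer words end up eligible for replacement.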

## Example

**Input:**
> "The quick brown fox jumps over the lazy dog."

**Watermarked Output:**
> "The swift brown fox leaps over the idle dog."

Changed words highlighted: swift (was quick), leaps (was jumps), idle (was lazy)
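
The change list above can be reproduced with a simple word-by-word comparison. This is a sketch under the assumption that the watermark replaces words one-for-one, so both texts split into the same number of whitespace-separated tokens. The function name `highlight_changes` and the `**` marker are illustrative choices, not the app's actual highlighting code.

```python
def highlight_changes(original, watermarked, mark="**"):
    """Return the watermarked text with changed words wrapped in `mark`,
    plus a list of (before, after) pairs.

    Assumes one-for-one word replacement (equal token counts).
    """
    src, dst = original.split(), watermarked.split()
    if len(src) != len(dst):
        raise ValueError("texts must contain the same number of words")
    highlighted, changes = [], []
    for before, after in zip(src, dst):
        if before == after:
            highlighted.append(after)
        else:
            highlighted.append(f"{mark}{after}{mark}")
            changes.append((before, after))
    return " ".join(highlighted), changes


if __name__ == "__main__":
    text, changes = highlight_changes(
        "The quick brown fox jumps over the lazy dog.",
        "The swift brown fox leaps over the idle dog.",
    )
    print(text)
    print(changes)
```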

⚠️ **Demo Version**: This is a demonstration that uses a lightweight model to showcase the watermarking pipeline. Results may be imperfect and are intended for testing purposes only.