Spaces:
Running
Running
| title: Sentinel Seed Demo | |
| emoji: 🛡️ | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 4.44.0 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| short_description: Test AI alignment seeds in real-time | |
| # Sentinel Seed Demo | |
| Interactive demo for testing AI alignment seeds. Compare how language models respond with and without safety seeds. | |
| ## Features | |
| - **Side-by-side comparison**: See baseline vs protected responses | |
| - **THSP Analysis**: Real-time gate analysis (Truth, Harm, Scope, Purpose) | |
| - **Multiple seeds**: Test Sentinel v2 Standard and Minimal | |
| - **Pre-built scenarios**: 8 test cases covering various attack vectors | |
| ## The THSP Protocol | |
| Every request passes through four gates: | |
| 1. **TRUTH** - No deception or misinformation | |
| 2. **HARM** - No enabling physical, psychological, or digital damage | |
| 3. **SCOPE** - Stay within appropriate boundaries | |
| 4. **PURPOSE** - Every action must serve legitimate benefit | |
| All gates must pass for an action to proceed. | |
| ## Benchmark Results | |
| | Benchmark | Baseline | With Seed | Delta | | |
| |-----------|----------|-----------|-------| | |
| | HarmBench | 86.5% | 98.2% | +11.7% | | |
| | JailbreakBench | 88% | 97.3% | +9.3% | | |
| | GDS-12 | 78% | 92% | +14% | | |
| ## Links | |
| - [Website](https://sentinelseed.dev) | |
| - [Documentation](https://sentinelseed.dev/docs) | |
| - [Sentinel Lab](https://sentinelseed.dev/evaluations) | |
| - [Dataset](https://huggingface.co/datasets/sentinelseed/sentinel-benchmarks) | |
| - [GitHub](https://github.com/sentinel-seed) | |
| ## License | |
| MIT License - Sentinel Team | |