---
title: Sentinel Seed Demo
emoji: 🛡️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
short_description: Test AI alignment seeds in real-time
---
# Sentinel Seed Demo
Interactive demo for testing AI alignment seeds. Compare how language models respond with and without safety seeds.
## Features
- **Side-by-side comparison**: See baseline vs protected responses
- **THSP Analysis**: Real-time gate analysis (Truth, Harm, Scope, Purpose)
- **Multiple seeds**: Test the Sentinel v2 Standard and Minimal seeds
- **Pre-built scenarios**: 8 test cases covering various attack vectors
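The side-by-side comparison can be pictured with a minimal sketch. It assumes the seed is supplied as a system prompt and that the baseline run uses an empty one; the `compare` helper, the `echo_model` stand-in, and the placeholder seed text are all hypothetical and are not taken from `app.py`.

```python
from typing import Callable, Dict

# Hypothetical signature for a model call: (system_prompt, user_prompt) -> reply.
Generate = Callable[[str, str], str]


def compare(prompt: str, seed: str, generate: Generate) -> Dict[str, str]:
    """Run the same prompt twice: once without a seed, once with it.

    Assumption: the seed is passed as the system prompt.
    """
    return {
        "baseline": generate("", prompt),
        "with_seed": generate(seed, prompt),
    }


def echo_model(system: str, user: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return f"[system chars: {len(system)}] {user}"


result = compare("Hello", "PLACEHOLDER SEED TEXT", echo_model)
print(result["baseline"])
print(result["with_seed"])
```

Swapping `echo_model` for a function that calls a real model would yield the two responses shown next to each other in the demo.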
## The THSP Protocol
Every request passes through four gates:
1. **TRUTH** - No deception or misinformation
2. **HARM** - No enabling of physical, psychological, or digital harm
3. **SCOPE** - Stay within appropriate boundaries
4. **PURPOSE** - Every action must serve legitimate benefit
All gates must pass for an action to proceed.
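The all-gates-must-pass rule can be sketched as a simple conjunction. This is an illustrative sketch only: the `thsp_decision` function and the boolean per-gate inputs are assumptions, and how each gate is actually evaluated is not shown here.

```python
from typing import Dict, List, Tuple

# Gate names in the order given above.
GATES: Tuple[str, ...] = ("TRUTH", "HARM", "SCOPE", "PURPOSE")


def thsp_decision(gate_results: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (proceed, failed_gates).

    A gate that is missing from gate_results is treated as failed,
    so an action proceeds only when all four gates explicitly pass.
    """
    failed = [gate for gate in GATES if not gate_results.get(gate, False)]
    return (not failed, failed)


proceed, failed = thsp_decision(
    {"TRUTH": True, "HARM": False, "SCOPE": True, "PURPOSE": True}
)
print(proceed, failed)  # False ['HARM']
```

Treating a missing gate as a failure is a design choice in this sketch: it keeps the default closed rather than open.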
## Benchmark Results
| Benchmark | Baseline | With Seed | Delta (percentage points) |
|-----------|----------|-----------|---------------------------|
| HarmBench | 86.5% | 98.2% | +11.7 |
| JailbreakBench | 88% | 97.3% | +9.3 |
| GDS-12 | 78% | 92% | +14 |
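The Delta column is the absolute difference between the two scores, in percentage points rather than relative percent. A short check of that arithmetic, using only the figures from the table (the variable names are illustrative):

```python
# (baseline, with_seed) scores in percent, copied from the table.
results = {
    "HarmBench": (86.5, 98.2),
    "JailbreakBench": (88.0, 97.3),
    "GDS-12": (78.0, 92.0),
}

# Absolute difference in percentage points, rounded to one decimal place.
deltas = {
    name: round(with_seed - baseline, 1)
    for name, (baseline, with_seed) in results.items()
}

for name, delta in deltas.items():
    print(f"{name}: +{delta} pp")
```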
## Links
- [Website](https://sentinelseed.dev)
- [Documentation](https://sentinelseed.dev/docs)
- [Sentinel Lab](https://sentinelseed.dev/evaluations)
- [Dataset](https://huggingface.co/datasets/sentinelseed/sentinel-benchmarks)
- [GitHub](https://github.com/sentinel-seed)
## License
MIT License - Sentinel Team