Spaces:

sentinelseed
/

sentinel-demo

Running

sentinel-demo / README.md

Add README.md

5b02e22 verified 5 days ago

1.57 kB

	---
	title: Sentinel Seed Demo
	emoji: 🛡️
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.44.0
	app_file: app.py
	pinned: false
	license: mit
	short_description: Test AI alignment seeds in real-time
	---

	# Sentinel Seed Demo

	Interactive demo for testing AI alignment seeds. Compare how language models respond with and without safety seeds.

	## Features

	- Side-by-side comparison: See baseline vs protected responses
	- THSP Analysis: Real-time gate analysis (Truth, Harm, Scope, Purpose)
	- Multiple seeds: Test Sentinel v2 Standard and Minimal
	- Pre-built scenarios: 8 test cases covering various attack vectors

	## The THSP Protocol

	Every request passes through four gates:

	1. TRUTH - No deception or misinformation
	2. HARM - No enabling physical, psychological, or digital damage
	3. SCOPE - Stay within appropriate boundaries
	4. PURPOSE - Every action must serve legitimate benefit

	All gates must pass for an action to proceed.

	## Benchmark Results

	\| Benchmark \| Baseline \| With Seed \| Delta \|
	\|-----------\|----------\|-----------\|-------\|
	\| HarmBench \| 86.5% \| 98.2% \| +11.7% \|
	\| JailbreakBench \| 88% \| 97.3% \| +9.3% \|
	\| GDS-12 \| 78% \| 92% \| +14% \|

	## Links

	- [Website](https://sentinelseed.dev)
	- [Documentation](https://sentinelseed.dev/docs)
	- [Sentinel Lab](https://sentinelseed.dev/evaluations)
	- [Dataset](https://huggingface.co/datasets/sentinelseed/sentinel-benchmarks)
	- [GitHub](https://github.com/sentinel-seed)

	## License

	MIT License - Sentinel Team