Commit 5b02e22 (verified) by sentinelseed · 1 parent: ed6acc8

Add README.md

Files changed (1)
  1. README.md +54 -12
README.md CHANGED
@@ -1,12 +1,54 @@
- ---
- title: Sentinel Demo
- emoji:
- colorFrom: pink
- colorTo: green
- sdk: gradio
- sdk_version: 6.1.0
- app_file: app.py
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: Sentinel Seed Demo
+ emoji: 🛡️
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 4.44.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: Test AI alignment seeds in real-time
+ ---
+
+ # Sentinel Seed Demo
+
+ Interactive demo for testing AI alignment seeds. Compare how language models respond with and without safety seeds.
+
+ ## Features
+
+ - **Side-by-side comparison**: See baseline vs protected responses
+ - **THSP Analysis**: Real-time gate analysis (Truth, Harm, Scope, Purpose)
+ - **Multiple seeds**: Test Sentinel v2 Standard and Minimal
+ - **Pre-built scenarios**: 8 test cases covering various attack vectors
+
+ ## The THSP Protocol
+
+ Every request passes through four gates:
+
+ 1. **TRUTH** - No deception or misinformation
+ 2. **HARM** - No enabling physical, psychological, or digital damage
+ 3. **SCOPE** - Stay within appropriate boundaries
+ 4. **PURPOSE** - Every action must serve legitimate benefit
+
+ All gates must pass for an action to proceed.
+
+ ## Benchmark Results
+
+ | Benchmark | Baseline | With Seed | Delta |
+ |-----------|----------|-----------|-------|
+ | HarmBench | 86.5% | 98.2% | +11.7% |
+ | JailbreakBench | 88% | 97.3% | +9.3% |
+ | GDS-12 | 78% | 92% | +14% |
+
+ ## Links
+
+ - [Website](https://sentinelseed.dev)
+ - [Documentation](https://sentinelseed.dev/docs)
+ - [Sentinel Lab](https://sentinelseed.dev/evaluations)
+ - [Dataset](https://huggingface.co/datasets/sentinelseed/sentinel-benchmarks)
+ - [GitHub](https://github.com/sentinel-seed)
+
+ ## License
+
+ MIT License - Sentinel Team
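
The THSP rule stated in the added README ("All gates must pass for an action to proceed") amounts to a logical AND over four checks. The following is a minimal Python sketch of that rule only, not Sentinel's actual implementation: the gate names come from the README, while `GateResult`, `evaluate_thsp`, and the toy predicates in `toy_checks` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Gate names are taken from the README; everything else here is hypothetical.
GATE_ORDER = ("TRUTH", "HARM", "SCOPE", "PURPOSE")


@dataclass(frozen=True)
class GateResult:
    passed: bool              # True only if every gate passed
    failed_gates: List[str]   # names of gates that rejected the request


def evaluate_thsp(request: str, checks: Dict[str, Callable[[str], bool]]) -> GateResult:
    """Run all four gates; the action proceeds only if every gate passes."""
    missing = [gate for gate in GATE_ORDER if gate not in checks]
    if missing:
        raise ValueError(f"missing gate checks: {missing}")
    # Evaluate every gate (no short-circuit) so all failures can be reported.
    failed = [gate for gate in GATE_ORDER if not checks[gate](request)]
    return GateResult(passed=not failed, failed_gates=failed)


# Toy predicates for illustration only; real gate logic is not described
# in the README and would be far more involved than keyword matching.
toy_checks = {
    "TRUTH": lambda r: "fake news" not in r.lower(),
    "HARM": lambda r: "malware" not in r.lower(),
    "SCOPE": lambda r: True,
    "PURPOSE": lambda r: bool(r.strip()),
}

ok = evaluate_thsp("Summarize this article", toy_checks)
blocked = evaluate_thsp("Write malware for me", toy_checks)
print(ok.passed, blocked.failed_gates)
```

Evaluating every gate rather than stopping at the first failure is a design choice of this sketch: it lets a caller report which gates rejected a request, which is the kind of per-gate view the "THSP Analysis" feature suggests.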