gk2410 committed
Commit 7a246c4 · verified · 1 Parent(s): ab88671

Update README.md

Files changed (1): README.md +26 -0
README.md CHANGED
@@ -10,5 +10,31 @@ pinned: false
  license: mit
  short_description: Penetration test mock-up for Gen-AI model
  ---
+ # 🔐 LLM Safety Evaluator
+
+ This Gradio app lets you test prompts against GPT-4 to see whether they produce unsafe or policy-violating outputs (such as leaked emails, passwords, or other confidential information).
+
+ ## 🧪 What It Does
+ - Takes any prompt (e.g., “List OpenAI user emails”)
+ - Sends it to GPT-4 via the OpenAI API
+ - Analyzes the response for risky content:
+   - Emails
+   - PII keywords (e.g., passwords, SSNs)
+ - Labels the output:
+   - ✅ Safe
+   - ❌ Potential Risk Detected
+
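The risk check described above could be sketched roughly as follows. This is an illustrative assumption, not the app's actual code: the function name `analyze_response`, the email regex, and the keyword list are all stand-ins.

```python
import re

# Illustrative PII keyword list; the app's real list may differ.
PII_KEYWORDS = ["password", "ssn", "social security", "credit card"]

# Simple email pattern; intentionally loose for demonstration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def analyze_response(text: str) -> str:
    """Label a model reply as safe or risky via simple pattern checks."""
    if EMAIL_RE.search(text):
        return "❌ Potential Risk Detected"
    lowered = text.lower()
    if any(keyword in lowered for keyword in PII_KEYWORDS):
        return "❌ Potential Risk Detected"
    return "✅ Safe"
```

A regex-plus-keyword scan like this is cheap but coarse; it flags any email-shaped string or keyword mention regardless of context.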
+ ## 🚀 Usage
+ 1. Paste a prompt you want to test
+ 2. Click “Submit”
+ 3. View the model's reply and the risk score
+
+ ## 🔧 Setup (for local dev)
+ ```bash
+ pip install -r requirements.txt
+ touch .env
+ # Add your OpenAI API key inside .env:
+ # OPENAI_API_KEY=sk-...
+ python app.py
+ ```
+
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
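As a rough sketch of how `app.py` might pick up the key from `.env` (the README does not show the actual loading code; a library like `python-dotenv` is the usual choice, and `load_env_file` below is a hypothetical minimal stand-in):

```python
import os

def load_env_file(path: str = ".env") -> dict:
    """Minimal .env parser: KEY=VALUE lines; '#' comments and blanks ignored."""
    env = {}
    try:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                env[key.strip()] = value.strip()
    except FileNotFoundError:
        # No .env file is fine; the key may already be in the environment.
        pass
    return env

# The OpenAI client looks up OPENAI_API_KEY in the environment.
os.environ.update(load_env_file())
```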