sumitranjan commited on
Commit
cafa05b
·
verified ·
1 Parent(s): 5d26e52

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +36 -2
README.md CHANGED
@@ -12,9 +12,26 @@ license: mit
12
 
13
  ## 🔍 Overview
14
 
15
- PromptShield is a multilingual prompt classification model designed to detect **unsafe or adversarial prompts**, particularly those that may lead to prompt injection or misuse of generative AI systems. It distinguishes between *safe* and *unsafe* inputs with over **99% accuracy**.
16
 
17
- Whether you're deploying a chatbot, a content moderation tool, or an LLM firewall, PromptShield provides an essential layer of safety assurance.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
18
 
19
  ---
20
 
@@ -83,3 +100,20 @@ Sumit Ranjan
83
 
84
  Raj Bapodra
85
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
 
13
  ## 🔍 Overview
14
 
15
+ 🛡️ PromptShield
16
 
17
+ **PromptShield** is a prompt classification model designed to detect **unsafe**, **adversarial**, or **prompt injection** inputs. Built on the `xlm-roberta-base` transformer, it delivers high-accuracy performance in distinguishing between **safe** and **unsafe** prompts — achieving **99.33% accuracy** during training.
18
+
19
+ ---
20
+
21
+ ## 📌 Overview
22
+
23
+ PromptShield is a robust binary classification model built on FacebookAI's `xlm-roberta-base`. Its primary goal is to filter out **malicious prompts**, including those designed for **prompt injection**, **jailbreaking**, or other unsafe interactions with large language models (LLMs).
24
+
25
+ Trained on a balanced and diverse dataset of real-world safe prompts and unsafe examples sourced from open datasets, PromptShield offers a lightweight, plug-and-play solution for enhancing AI system security.
26
+
27
+ Whether you're building:
28
+
29
+ - Chatbot pipelines
30
+ - Content moderation layers
31
+ - LLM firewalls
32
+ - AI safety filters
33
+
34
+ **PromptShield** delivers reliable detection of harmful inputs before they reach your AI stack.
35
 
36
  ---
37
 
 
100
 
101
  Raj Bapodra
102
 
103
+ 🛡️ Ideal Use Cases
104
+ LLM Firewalls & Guardrails
105
+
106
+ AI Content Moderation
107
+
108
+ Prompt Validation Pipelines
109
+
110
+ Multi-Agent System Safety
111
+
112
+ AI Red Teaming Pre-filters
113
+
114
+ 📄 License
115
+ MIT License (or your preferred open-source license here)
116
+
117
+ ⭐️ Citation
118
+ If you use PromptShield, please consider citing this work or linking back to the Hugging Face model page.
119
+