sumitranjan/PromptShield

Text Classification

sumitranjan committed on May 18, 2025

Commit cafa05b · verified · 1 Parent(s): 5d26e52

Update README.md

Files changed (1)

README.md +36 -2

@@ -12,9 +12,26 @@ license: mit
 ## 🔍 Overview
-PromptShield is a multilingual prompt classification model designed to detect **unsafe or adversarial prompts**, particularly those that may lead to prompt injection or misuse of generative AI systems. It distinguishes between *safe* and *unsafe* inputs with over **99% accuracy**.
-Whether you're deploying a chatbot, a content moderation tool, or an LLM firewall, PromptShield provides an essential layer of safety assurance.
+ 🛡️ PromptShield
+**PromptShield** is a prompt classification model designed to detect **unsafe**, **adversarial**, or **prompt injection** inputs. Built on the `xlm-roberta-base` transformer, it delivers high-accuracy performance in distinguishing between **safe** and **unsafe** prompts — achieving **99.33% accuracy** during training.
+---
+## 📌 Overview
+PromptShield is a robust binary classification model built on FacebookAI's `xlm-roberta-base`. Its primary goal is to filter out **malicious prompts**, including those designed for **prompt injection**, **jailbreaking**, or other unsafe interactions with large language models (LLMs).
+Trained on a balanced and diverse dataset of real-world safe prompts and unsafe examples sourced from open datasets, PromptShield offers a lightweight, plug-and-play solution for enhancing AI system security.
+Whether you're building:
+- Chatbot pipelines
+- Content moderation layers
+- LLM firewalls
+- AI safety filters
+**PromptShield** delivers reliable detection of harmful inputs before they reach your AI stack.
 ---
@@ -83,3 +100,20 @@ Sumit Ranjan
 Raj Bapodra
+🛡️ Ideal Use Cases
+LLM Firewalls & Guardrails
+AI Content Moderation
+Prompt Validation Pipelines
+Multi-Agent System Safety
+AI Red Teaming Pre-filters
+📄 License
+MIT License
+⭐️ Citation
+If you use PromptShield, please consider citing this work or linking back to the Hugging Face model page.
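
The README text added in this commit describes PromptShield as a binary safe/unsafe classifier that screens prompts before they reach an LLM. A minimal sketch of that gating pattern follows. Several things here are assumptions, not taken from the commit: the Hub id `sumitranjan/PromptShield` is inferred from the page header, the repository is assumed to ship weights that the `transformers` text-classification pipeline can load, the label string `"unsafe"` is a guess (check the model's `id2label` config), and the helper names `load_promptshield` and `is_allowed` are hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence

# A classifier maps one prompt to a list such as [{"label": "...", "score": 0.98}],
# which is the shape a Hugging Face text-classification pipeline returns.
Classifier = Callable[[str], List[Dict[str, Any]]]


def load_promptshield() -> Classifier:
    """Hypothetical loader: builds a pipeline from the Hub.

    Assumes `transformers` plus a backend are installed and that the
    repository id below is correct; weights download on first use.
    """
    from transformers import pipeline  # lazy import so the gate itself has no heavy deps

    return pipeline("text-classification", model="sumitranjan/PromptShield")


def is_allowed(
    prompt: str,
    classify: Classifier,
    unsafe_labels: Sequence[str] = ("unsafe",),  # assumption: verify against id2label
    threshold: float = 0.5,
) -> bool:
    """Return False when the top prediction is an unsafe label at or above threshold."""
    top = classify(prompt)[0]
    label = str(top["label"]).lower()
    score = float(top["score"])
    blocked = label in {name.lower() for name in unsafe_labels} and score >= threshold
    return not blocked
```

In a pipeline this would be called as `is_allowed(user_prompt, load_promptshield())` before forwarding `user_prompt` to the LLM. Passing the classifier in as a callable keeps the gate testable with a stub and makes the label names and threshold easy to adjust once the real model outputs are known.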