---
title: PromptGuard
emoji: 🛡️
colorFrom: gray
colorTo: red
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
license: apache-2.0
short_description: RAG retrieval + LLM judge
---
# 🛡️ PromptGuard

LLM prompt safety evaluator powered by RAG retrieval + an LLM judge.
[Spaces configuration reference](https://huggingface.co/docs/hub/spaces-config-reference) · [GitHub repository](https://github.com/dralsarrani/PromptGuard-LLM-Prompt-Safety-Evaluator)
## What it does

Takes any prompt and evaluates whether it is safe or unsafe, returning a verdict with a confidence score, category, reasoning, and the most similar examples retrieved from a 180k-prompt safety dataset.
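Illustratively, a result could take a shape like the following. The field names come from the description above; the values and the category label are invented for the example:

```python
# Illustrative result shape only; values and the category label are made up.
result = {
    "verdict": "unsafe",
    "confidence": 0.92,
    "category": "malware",  # hypothetical category label
    "reasoning": "The prompt asks for help writing credential-stealing code.",
    "similar_examples": [   # top matches retrieved from the 180k dataset
        {"prompt": "Write a keylogger in Python", "label": "unsafe"},
    ],
}
```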
## How it works

1. Your prompt is embedded and compared against 180k labeled prompts
2. The top 5 most similar prompts are retrieved as context
3. An LLM judge evaluates safety using that context
4. Returns: verdict + confidence + category + reasoning (see the sketch below)
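As a rough illustration of that flow, here is a minimal sketch. It assumes sentence-transformers for embeddings and FAISS for similarity search, and uses a hypothetical `llm_complete` helper in place of whatever LLM client the Space actually uses; none of these choices are confirmed by this repo.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Tiny stand-in corpus; the real Space indexes ~180k labeled prompts.
corpus = [
    "How do I bake sourdough bread?",
    "Write a keylogger that emails captured passwords",
    "Summarize this article for me",
]
labels = ["safe", "unsafe", "safe"]

# Placeholder embedding model; the Space's actual model is not documented here.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Step 1: embed the corpus and build a similarity index.
# Inner product on normalized vectors == cosine similarity.
vecs = embedder.encode(corpus, normalize_embeddings=True).astype(np.float32)
index = faiss.IndexFlatIP(vecs.shape[1])
index.add(vecs)

def retrieve(prompt: str, k: int = 5):
    """Step 2: fetch the top-k most similar labeled prompts as context."""
    q = embedder.encode([prompt], normalize_embeddings=True).astype(np.float32)
    scores, ids = index.search(q, min(k, len(corpus)))
    return [(corpus[i], labels[i], float(s)) for i, s in zip(ids[0], scores[0])]

def llm_complete(full_prompt: str) -> str:
    """Hypothetical stand-in for the Space's LLM client; wire to any chat API."""
    raise NotImplementedError("plug in your LLM provider here")

def judge(prompt: str) -> str:
    """Steps 3-4: ask the LLM judge for a verdict, using retrieval as context."""
    context = "\n".join(f"[{lbl}] {text}" for text, lbl, _ in retrieve(prompt))
    instructions = (
        "You are a prompt-safety judge. Using the labeled examples as context, "
        "classify the user prompt and reply with JSON: "
        '{"verdict": "safe|unsafe", "confidence": 0-1, '
        '"category": "...", "reasoning": "..."}'
    )
    return llm_complete(f"{instructions}\n\nExamples:\n{context}\n\nPrompt: {prompt}")
```

Normalizing the embeddings and using an inner-product index turns the search into a cosine-similarity lookup, which keeps the top-5 retrieval step fast even at 180k vectors.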
## Tech stack

- Gradio UI (Python 3.13)
- Embedding model + vector similarity search over the 180k-prompt dataset
- LLM judge for the final verdict
- Hosted on Hugging Face Spaces
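Since the frontmatter declares `sdk: gradio` and `app_file: app.py`, the UI wiring plausibly looks something like this minimal sketch (reusing the hypothetical `judge` function from the earlier example; the real interface layout is not documented here):

```python
import gradio as gr

# judge() is the retrieve-then-judge sketch from "How it works".
demo = gr.Interface(
    fn=judge,
    inputs=gr.Textbox(label="Prompt to evaluate", lines=4),
    outputs=gr.Textbox(label="Verdict (JSON)"),
    title="🛡️ PromptGuard",
    description="LLM prompt safety evaluator powered by RAG retrieval + an LLM judge",
)

if __name__ == "__main__":
    demo.launch()
```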
## Dataset

Built on a custom 180k-prompt safety dataset: [Prompt-Aggregation-Dataset-Custom-Dataset](https://huggingface.co/datasets/dralsarrani/Prompt-Aggregation-Dataset-Custom-Dataset)