---
title: PromptGuard
emoji: 🚀
colorFrom: gray
colorTo: red
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
license: apache-2.0
short_description: RAG retrieval + LLM judge
---

# 🛡️ PromptGuard

An LLM prompt safety evaluator powered by RAG + AI.

[![HF Config Docs](https://img.shields.io/badge/HF%20Spaces-Config%20Reference-yellow?logo=huggingface)](https://huggingface.co/docs/hub/spaces-config-reference)
[![GitHub Repo](https://img.shields.io/badge/GitHub-PromptGuard%20Repo-blue?logo=github)](https://github.com/dralsarrani/PromptGuard-LLM-Prompt-Safety-Evaluator)

## What it does

Takes any prompt and classifies it as safe or unsafe, returning a confidence score, a category, the judge's reasoning, and similar examples retrieved from a 180k prompt safety dataset.

## How it works

1. Your prompt is embedded and compared against 180k labeled prompts
2. The top 5 most similar prompts are retrieved as context
3. An LLM judge evaluates safety using that context
4. Returns: verdict + confidence + category + reasoning

A minimal code sketch of this pipeline appears at the end of this README.

## Tech stack

![Python](https://img.shields.io/badge/Python-3.10+-blue?logo=python)
![ChromaDB](https://img.shields.io/badge/ChromaDB-Vector%20Store-orange)
![SentenceTransformers](https://img.shields.io/badge/Sentence--Transformers-Embeddings-green)
![OpenRouter](https://img.shields.io/badge/OpenRouter-LLM%20API-purple)
![Gradio](https://img.shields.io/badge/Gradio-UI%20Framework-yellow?logo=gradio)
![HuggingFace](https://img.shields.io/badge/HuggingFace-Spaces-black?logo=huggingface)

## Dataset

Built on a custom 180k prompt safety dataset ->
[![Dataset](https://img.shields.io/badge/HF%20Dataset-Prompt%20Aggregation-blue?logo=huggingface)](https://huggingface.co/datasets/dralsarrani/Prompt-Aggregation-Dataset-Custom-Dataset)
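
## Pipeline sketch

A minimal sketch of the retrieve-then-judge flow described above, using the stack listed in this README (Sentence-Transformers for embeddings, ChromaDB for retrieval, OpenRouter's OpenAI-compatible API for the judge). The embedding model, collection name, judge model, and system prompt here are illustrative assumptions, not the Space's actual configuration — see `app.py` for the real implementation.

```python
import os

import chromadb
from openai import OpenAI
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")        # assumed embedding model
store = chromadb.PersistentClient(path="./chroma")        # assumed local index path
collection = store.get_or_create_collection("prompt_safety")  # assumed collection name

# OpenRouter exposes an OpenAI-compatible endpoint.
judge = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def evaluate(prompt: str) -> str:
    # 1. Embed the incoming prompt.
    query_vec = embedder.encode(prompt).tolist()

    # 2. Retrieve the 5 most similar labeled prompts as context.
    hits = collection.query(query_embeddings=[query_vec], n_results=5)
    context = "\n".join(
        f"- {doc} (label: {meta.get('label')})"
        for doc, meta in zip(hits["documents"][0], hits["metadatas"][0])
    )

    # 3. Ask the LLM judge for verdict + confidence + category + reasoning.
    response = judge.chat.completions.create(
        model="openai/gpt-4o-mini",  # assumed judge model
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a prompt-safety judge. Given a prompt and similar "
                    "labeled examples, return a verdict (safe/unsafe), a "
                    "confidence score, a category, and brief reasoning."
                ),
            },
            {
                "role": "user",
                "content": f"Prompt:\n{prompt}\n\nSimilar examples:\n{context}",
            },
        ],
    )
    return response.choices[0].message.content
```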