---
title: Redac
emoji: 🛡️
colorFrom: blue
colorTo: gray
sdk: docker
pinned: false
license: mit
tags:
- security
- privacy
- nlp
- pii-redaction
- privacy-protection
- french-nlp
---
# 🛡️ Redac
A lightweight PII (Personally Identifiable Information) moderation MVP designed to sanitize sensitive data before it reaches LLM APIs.

---
## 🎥 Demo
Check out Redac in action:
<div align="center">
<a href="https://www.youtube.com/shorts/OkwsoL4H5cc">
<img src="https://img.youtube.com/vi/OkwsoL4H5cc/maxresdefault.jpg" alt="Redac Demo Video" width="600" style="border-radius: 20px; box-shadow: 0 10px 30px rgba(0,0,0,0.5);">
</a>
<p><i>Click to watch the full demo on YouTube</i></p>
</div>

---
## 📖 API Documentation
The Redac API is open and can be integrated into your own workflows.
### 🏠 Base URL
`https://lbl-redaction.hf.space/api`
### ⚑ Rate Limiting
- **1 request every 2 seconds** per IP address to ensure stability.
- Exceeding this limit returns a `429 Too Many Requests` status code with a custom error message.
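
One way for a client to stay under this limit is to space its calls at least two seconds apart. The sketch below is illustrative only: `Throttle` and its parameters are hypothetical names, not part of Redac, and it assumes a single client process behind one IP address. The clock and sleep functions are injectable so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Client-side helper that spaces calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call; None before the first one

    def wait(self):
        """Block until a new request is allowed and return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            # Time still missing before the minimum interval has elapsed.
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        # Record when this call is considered to have started.
        self._last_call = now + delay
        return delay
```

Call `wait()` on a shared `Throttle` instance right before each request. If a `429` still comes back (for example because several clients share one IP), pausing for a couple of seconds and retrying is a reasonable fallback; the exact format of the error message is not documented here, so the sketch does not parse it.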
### 🔍 Endpoints
#### 1. Redact Text
Processes the submitted text and returns the anonymized version along with metadata about the detected entities.
- **URL**: `/redact`
- **Method**: `POST`
- **Headers**: `Content-Type: application/json`
- **Body**:
```json
{
"text": "Your sensitive text here",
"language": "auto"
}
```
*(Options for language: `auto`, `en`, `fr`)*
- **Success Response (200 OK)**:
```json
{
"original_text": "...",
"redacted_text": "My name is <PERSON>",
"detected_language": "en",
"entities": [
{
"type": "PERSON",
"text": "John Doe",
"score": 95,
"start": 11,
"end": 19
}
]
}
```
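As a minimal sketch, the request and response above could be handled from Python using only the standard library. The helper names (`build_redact_request`, `redact`, `apply_entities`) and the 30-second timeout are assumptions made for illustration. `apply_entities` further assumes that `start` and `end` are character offsets into the original text, with `end` exclusive and no overlapping spans, which is consistent with the sample response but not confirmed by it.

```python
import json
import urllib.request

BASE_URL = "https://lbl-redaction.hf.space/api"  # base URL given in this README


def build_redact_request(text, language="auto", base_url=BASE_URL):
    """Build the POST /redact request without sending it."""
    if language not in ("auto", "en", "fr"):
        raise ValueError("language must be 'auto', 'en' or 'fr'")
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/redact",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def redact(text, language="auto", timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_redact_request(text, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def apply_entities(original_text, entities):
    """Replace each entity span with a <TYPE> placeholder (assumed offset semantics)."""
    result = original_text
    # Work from the last span to the first so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = "<" + entity["type"] + ">"
        result = result[: entity["start"]] + placeholder + result[entity["end"]:]
    return result


# Example usage (performs a real network call, subject to the rate limit):
# result = redact("My name is John Doe", language="en")
# print(result["redacted_text"])
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything leaves the machine.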
#### 2. System Status
Checks if the API and NLP engines are online.
- **URL**: `/status`
- **Method**: `GET`
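
A health check against this endpoint might look like the sketch below. The response body of `/status` is not documented here, so the code only looks at the HTTP status code; `status_url`, `is_online`, and the 10-second timeout are hypothetical choices. Passing `http://localhost:8000/api` as `base_url` would target a local instance instead.

```python
import urllib.request

BASE_URL = "https://lbl-redaction.hf.space/api"  # base URL given in this README


def status_url(base_url=BASE_URL):
    """Return the full URL of the status endpoint for a given base URL."""
    return base_url.rstrip("/") + "/status"


def is_online(base_url=BASE_URL, timeout=10):
    """Return True if GET /status answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False
```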
---
## 🚀 Key Features
- **Multi-Language Support**: High-accuracy detection for **English** and **French** using `spaCy` Large models.
- **Double-Pass Protection**: Combines NLP-based detection with expert Regex patterns for broader PII coverage.
- **Expert French Recognizers**: Built-in support for French-specific data: **SIRET**, **NIR**, **IBAN**, and addresses.
- **Balanced Anonymization**: Preserves job titles and document structure to keep texts readable.
- **Minimal Dashboard**: React-based UI with Risk Assessment visualization.
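
To illustrate what a regex pass for French-specific identifiers can look like, here is a standalone sketch for SIRET numbers. It is not Redac's actual recognizer: the pattern, the accepted spacing, and the function names are assumptions. It relies on the fact that a SIRET has 14 digits and is checked with the Luhn algorithm, so the regex proposes candidates and the checksum filters out random 14-digit strings. Real-world edge cases beyond the plain checksum are ignored. In a double-pass setup, spans found this way would be merged with the entities returned by the NLP model.

```python
import re

# 14 digits, optionally grouped as "123 456 789 00012" (assumed formats).
SIRET_CANDIDATE = re.compile(r"\b\d{3} ?\d{3} ?\d{3} ?\d{5}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sirets(text):
    """Return (start, end, digits) for each checksum-valid SIRET candidate."""
    matches = []
    for match in SIRET_CANDIDATE.finditer(text):
        digits = match.group().replace(" ", "")
        if luhn_valid(digits):
            matches.append((match.start(), match.end(), digits))
    return matches
```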
---
## 🛠️ Architecture
1. **Core API (`/api`)**: FastAPI server powered by **Microsoft Presidio**.
2. **Web Dashboard (`/ui`)**: React + Vite + Tailwind CSS.
---
## 📦 Local Development
### Manual Docker commands
```bash
docker compose up --build
```
- **API**: `http://localhost:8000/api`
- **UI Dashboard**: `http://localhost:5173`
---
## 📄 License
MIT - Created for secure LLM workflows.