---
title: Redac
emoji: 🛡️
colorFrom: blue
colorTo: gray
sdk: docker
pinned: false
license: mit
tags:
  - security
  - privacy
  - nlp
  - pii-redaction
  - privacy-protection
  - french-nlp
---

# 🛡️ Redac

A lightweight PII (Personally Identifiable Information) moderation MVP designed to sanitize sensitive data before it reaches LLM APIs.

---

## 🎥 Demo

Check out Redac in action:

<div align="center">
  <a href="https://www.youtube.com/shorts/OkwsoL4H5cc">
    <img src="https://img.youtube.com/vi/OkwsoL4H5cc/maxresdefault.jpg" alt="Redac Demo Video" width="600" style="border-radius: 20px; box-shadow: 0 10px 30px rgba(0,0,0,0.5);">
  </a>
  <p><i>Click to watch the full demo on YouTube</i></p>
</div>

---

## 📖 API Documentation

The Redac API is open and can be integrated into your own workflows.

### 🏠 Base URL
`https://lbl-redaction.hf.space/api`

### ⚡ Rate Limiting
- **1 request every 2 seconds** per IP address, to keep the service stable.
- Exceeding this limit returns a `429 Too Many Requests` status code with an explanatory message.
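
A client can stay under this limit by spacing its own calls. The following is a minimal sketch, not part of Redac: the `Throttle` class and its parameters are hypothetical names, and it only assumes the 2-second interval stated above.

```python
import time


class Throttle:
    """Client-side spacing of calls: at most one per `min_interval` seconds."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None      # time at which the previous call was allowed

    def wait(self):
        """Block until a call is allowed; return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Call `throttle.wait()` before each request. If the API still answers with `429`, for example because several clients share one IP address, waiting another two seconds before retrying is a reasonable fallback.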

### 🔍 Endpoints

#### 1. Redact Text
Processes the submitted text and returns the anonymized version along with metadata about the detected entities.

- **URL**: `/redact`
- **Method**: `POST`
- **Headers**: `Content-Type: application/json`
- **Body**:
  ```json
  {
    "text": "Your sensitive text here",
    "language": "auto"
  }
  ```
  *(Options for language: `auto`, `en`, `fr`)*

- **Success Response (200 OK)**:
  ```json
  {
    "original_text": "...",
    "redacted_text": "My name is <PERSON>",
    "detected_language": "en",
    "entities": [
      {
        "type": "PERSON",
        "text": "John Doe",
        "score": 95,
        "start": 11,
        "end": 19
      }
    ]
  }
  ```
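
The request and response shapes above can be exercised from Python with the standard library alone. This is a minimal sketch: the helper names `build_redact_request`, `redact_text`, and `entity_summary` are hypothetical, and the code assumes the response fields are exactly those shown in the example above.

```python
import json
import urllib.request

BASE_URL = "https://lbl-redaction.hf.space/api"


def build_redact_request(text, language="auto", base_url=BASE_URL):
    """Build the POST /redact request without sending it."""
    if language not in ("auto", "en", "fr"):
        raise ValueError(f"unsupported language: {language!r}")
    body = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/redact",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def redact_text(text, language="auto", timeout=10.0):
    """Send the request and return the decoded JSON response (needs network access)."""
    request = build_redact_request(text, language)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def entity_summary(response):
    """List (type, text, score) for each detected entity in a response dict."""
    return [(e["type"], e["text"], e["score"]) for e in response.get("entities", [])]
```

In the documented example, `start` and `end` behave like Python slice offsets into the submitted text: `"My name is John Doe"[11:19]` is `"John Doe"`.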

#### 2. System Status
Checks if the API and NLP engines are online.

- **URL**: `/status`
- **Method**: `GET`
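
A simple health check can rely on the HTTP status code alone, since the response body of `/status` is not documented here. In the sketch below, `is_online` and its `opener` parameter are hypothetical names, and treating anything other than HTTP 200 as "offline" is an assumption.

```python
import urllib.error
import urllib.request

BASE_URL = "https://lbl-redaction.hf.space/api"


def is_online(base_url=BASE_URL, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET {base_url}/status answers with HTTP 200."""
    try:
        with opener(f"{base_url}/status", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        # Covers HTTP error statuses, connection failures, and timeouts.
        return False
```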

---

## 🚀 Key Features

- **Multi-Language Support**: High-accuracy detection for **English** and **French** using `spaCy` Large models.
- **Double-Pass Protection**: Combines NLP-based detection with expert Regex patterns for broader PII coverage.
- **Expert French Recognizers**: Built-in support for French-specific data: **SIRET**, **NIR**, **IBAN**, and addresses.
- **Balanced Anonymization**: Preserves job titles and document structure to keep texts readable.
- **Minimal Dashboard**: React-based UI with Risk Assessment visualization.
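
To illustrate what a regex pass for a French identifier can look like, here is a minimal sketch for SIRET numbers. It is not Redac's actual recognizer: the pattern, the function names, and the `SIRET` entity label are assumptions made for illustration. It relies on SIRET numbers being 14 digits that normally satisfy the Luhn checksum, and it reuses the `<TYPE>` placeholder style from the API example.

```python
import re

# Hypothetical pattern: 14 digits, optionally grouped as 3-3-3-5 with spaces.
SIRET_PATTERN = re.compile(r"\b\d{3} ?\d{3} ?\d{3} ?\d{5}\b")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sirets(text):
    """Return entity dicts for Luhn-valid SIRET candidates found in the text."""
    entities = []
    for match in SIRET_PATTERN.finditer(text):
        if luhn_valid(match.group().replace(" ", "")):
            entities.append({
                "type": "SIRET",
                "text": match.group(),
                "start": match.start(),
                "end": match.end(),
            })
    return entities


def mask_entities(text, entities):
    """Replace each entity span with <TYPE>, right to left so offsets stay valid."""
    for e in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:e["start"]] + f"<{e['type']}>" + text[e["end"]:]
    return text
```

The checksum step is what keeps such a pass from masking arbitrary 14-digit strings. In a two-pass setup, spans found this way would be merged with the entities returned by the NLP model before masking.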

---

## 🛠️ Architecture

1.  **Core API (`/api`)**: FastAPI server powered by **Microsoft Presidio**.
2.  **Web Dashboard (`/ui`)**: React + Vite + Tailwind CSS.

---

## 📦 Local Development

### Manual Docker commands
```bash
docker compose up --build
```

- **API**: `http://localhost:8000/api`
- **UI Dashboard**: `http://localhost:5173`

---

## 📄 License
MIT - Created for secure LLM workflows.