bekzhanK1 committed
Commit 62d55bf · 1 Parent(s): c05c750

add readme

Files changed (3)
  1. README.md +168 -19
  2. signature/README.md +0 -118
  3. stamp_detector/README.md +0 -121
README.md CHANGED
@@ -10,28 +10,59 @@ license: mit

  # Document Processing Pipeline API

- FastAPI service for detecting QR codes, signatures, and stamps in PDF documents.

- ## Features

- - **QR Code Detection**: Detects and decodes QR codes in documents
- - **Signature Detection**: Uses YOLOv8s to detect signatures
- - **Stamp Detection**: Uses YOLOv8 to detect stamps/seals
- - **PDF Support**: Processes multi-page PDF documents

- ## API Endpoints

- - `POST /process-pdf` - Upload and process PDF file
- - `POST /process-pdf-from-url` - Process PDF from URL (S3 or HTTP/HTTPS)
- - `GET /docs` - Interactive API documentation
- - `GET /health` - Health check

- Visit `/docs` for interactive API documentation.

- ## Usage

- ### Process PDF via API

  ```bash
  curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
  -F "file=@document.pdf" \
@@ -39,14 +70,132 @@ curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
  -F "stamp_conf=0.25"
  ```

- ### Process PDF from URL

  ```bash
- curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf-from-url?pdf_url=https://example.com/document.pdf"
  ```

- ## Model Requirements

- - Signature model: Automatically downloaded from Hugging Face
- - Stamp model: Must be uploaded to `stamp_detector/stamp_model.pt` in this repository


  # Document Processing Pipeline API

+ A production-ready FastAPI service for automated detection and extraction of QR codes, signatures, and stamps from PDF documents. The pipeline processes multi-page PDFs sequentially through three specialized detection models and returns consolidated JSON results.

+ ## Overview

+ This API provides a unified interface for document analysis, combining multiple computer vision models to extract structured information from PDF documents. It supports concurrent processing of multiple documents and can handle both file uploads and remote PDF URLs.

+ ## Detection Models
+
+ ### 1. QR Code Detection
+ - **Method**: OpenCV `QRCodeDetector` (native implementation)
+ - **Library**: OpenCV Python (`cv2`)
+ - **Approach**: Multi-preprocessing pipeline with adaptive thresholding
+ - **Features**:
+   - Detects multiple QR codes per page
+   - Decodes QR code data automatically
+   - Uses CLAHE (Contrast Limited Adaptive Histogram Equalization) for enhanced detection
+   - Tests multiple preprocessing approaches (grayscale, binary, Otsu thresholding, inverted)
+ - **Output**: Bounding box coordinates, decoded data, corner points
+
+ ### 2. Signature Detection
+ - **Model**: YOLOv8s (Small variant)
+ - **Source**: `tech4humans/yolov8s-signature-detector` (Hugging Face Hub)
+ - **Framework**: Ultralytics YOLO
+ - **Architecture**: YOLOv8s, chosen for its balance of speed and accuracy
+ - **Access**: Gated model (requires a Hugging Face authentication token)
+ - **Features**:
+   - Real-time signature detection
+   - Confidence scoring for each detection
+   - Bounding box coordinates with normalized values
+ - **Output**: Signature locations, confidence scores, bounding boxes
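
The feature list above mentions bounding boxes with normalized values. A minimal sketch of what such a normalization could look like, assuming it means dividing pixel coordinates by the page image's width and height. The helper name `normalize_bbox` is hypothetical and is not part of the service:

```python
def normalize_bbox(bbox, image_width, image_height, ndigits=6):
    """Scale a pixel-space bbox into the 0.0-1.0 range (hypothetical helper)."""
    return {
        "x1": round(bbox["x1"] / image_width, ndigits),
        "y1": round(bbox["y1"] / image_height, ndigits),
        "x2": round(bbox["x2"] / image_width, ndigits),
        "y2": round(bbox["y2"] / image_height, ndigits),
    }


# Illustrative pixel bbox on a 1654 x 2339 page image.
normalized = normalize_bbox({"x1": 500, "y1": 800, "x2": 700, "y2": 900}, 1654, 2339)
print(normalized)
```

Normalized coordinates stay valid when a client displays the page at a different size, since they only need to be multiplied by the displayed width and height.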

+ ### 3. Stamp Detection
+ - **Model**: Custom YOLOv8 model
+ - **Framework**: Ultralytics YOLO
+ - **Model File**: `stamp_model.pt` (custom trained)
+ - **Default Confidence Threshold**: 0.25
+ - **Features**:
+   - Detects stamps and seals on documents
+   - Configurable confidence threshold
+   - Supports custom model paths
+ - **Output**: Stamp locations, confidence scores, bounding boxes
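
To illustrate the effect of the confidence threshold, here is a small sketch that filters a list of detections. The detection dictionaries and the function name `filter_by_confidence` are made up for illustration; in practice Ultralytics applies the threshold inside the model call, so the service may never run code like this itself:

```python
def filter_by_confidence(detections, threshold=0.25):
    """Keep only detections whose confidence is at or above the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


# Hypothetical raw detections for one page.
candidates = [
    {"id": 1, "confidence": 0.87},
    {"id": 2, "confidence": 0.31},
    {"id": 3, "confidence": 0.12},
]

kept_default = filter_by_confidence(candidates)       # default threshold 0.25
kept_strict = filter_by_confidence(candidates, 0.35)  # stricter threshold
print([d["id"] for d in kept_default], [d["id"] for d in kept_strict])
```

Lowering the threshold surfaces faint or partial stamps at the cost of more false positives; raising it does the opposite.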

+ ## API Endpoints

+ ### `POST /process-pdf`
+ Upload and process a PDF file directly.

+ **Parameters**:
+ - `file` (multipart/form-data): PDF file to process
+ - `dpi` (int, default: 200): Resolution for PDF-to-image conversion
+ - `stamp_conf` (float, default: 0.25): Confidence threshold for stamp detection

+ **Example**:
  ```bash
  curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
  -F "file=@document.pdf" \
@@ -39,14 +70,132 @@ curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
  -F "stamp_conf=0.25"
  ```

+ ### `POST /process-pdf-advanced`
+ Process a PDF with advanced options, including custom model paths.
+
+ **Parameters**:
+ - `file` (multipart/form-data): PDF file to process
+ - `dpi` (int, default: 200): Resolution for PDF-to-image conversion
+ - `stamp_conf` (float, default: 0.25): Confidence threshold for stamp detection
+ - `stamp_model` (str, optional): Path to custom stamp model
+
+ ### `POST /process-pdf-from-url`
+ Process a PDF from a remote URL (S3, HTTP, or HTTPS).

+ **Parameters**:
+ - `pdf_url` (query string): URL of the PDF file
+ - `dpi` (int, default: 200): Resolution for PDF-to-image conversion
+ - `stamp_conf` (float, default: 0.25): Confidence threshold for stamp detection
+ - `stamp_model` (str, optional): Path to custom stamp model
+
+ **Example**:
  ```bash
+ curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf-from-url?pdf_url=https://example.com/document.pdf&dpi=200"
+ ```
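
When the PDF URL itself contains a query string (for example a pre-signed S3 link), it should be percent-encoded before being passed as `pdf_url`, otherwise its own `&` and `=` characters would be read as separate parameters. A minimal sketch using only the Python standard library; the helper name `build_process_url` and the sample PDF URL are assumptions for illustration:

```python
from urllib.parse import urlencode

BASE_URL = "https://bekzhanK1-armeta-hackaton.hf.space"


def build_process_url(pdf_url, dpi=200, stamp_conf=0.25):
    """Build a /process-pdf-from-url request URL with encoded query parameters."""
    query = urlencode({"pdf_url": pdf_url, "dpi": dpi, "stamp_conf": stamp_conf})
    return f"{BASE_URL}/process-pdf-from-url?{query}"


# Hypothetical remote PDF whose URL carries its own query string.
request_url = build_process_url("https://example.com/document.pdf?version=2&lang=en")
print(request_url)
```

The resulting string can be passed to `curl -X POST` or any HTTP client in place of the hand-written URL.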
+
+ ### `GET /health`
+ Health check endpoint.
+
+ ### `GET /docs`
+ Interactive API documentation (Swagger UI).
+
+ ## Response Format
+
+ The API returns a JSON object with the following structure:
+
+ ```json
+ {
+   "pdf_file": "document.pdf",
+   "total_pages": 1,
+   "summary": {
+     "total_pages": 1,
+     "total_qr_codes": 1,
+     "total_signatures": 1,
+     "total_stamps": 1,
+     "total_detections": 3
+   },
+   "pages": [
+     {
+       "page_number": 1,
+       "image": "document_page_1.jpg",
+       "image_dimensions": {
+         "width": 1654,
+         "height": 2339
+       },
+       "qr_codes": [
+         {
+           "id": 1,
+           "x": 100,
+           "y": 200,
+           "width": 150,
+           "height": 150,
+           "data": "https://example.com"
+         }
+       ],
+       "signatures": [
+         {
+           "id": 1,
+           "confidence": 0.95,
+           "bbox": {
+             "x1": 500,
+             "y1": 800,
+             "x2": 700,
+             "y2": 900
+           }
+         }
+       ],
+       "stamps": [
+         {
+           "id": 1,
+           "confidence": 0.87,
+           "bbox": {
+             "x1": 1200,
+             "y1": 100,
+             "x2": 1400,
+             "y2": 300
+           }
+         }
+       ]
+     }
+   ]
+ }
  ```
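
A client can walk the `pages` array to recount detections per type. The sketch below assumes the field names shown in the response structure and uses a trimmed, made-up payload; the function name `count_detections` is hypothetical:

```python
import json


def count_detections(response):
    """Count QR codes, signatures, and stamps across all pages of a response."""
    totals = {"qr_codes": 0, "signatures": 0, "stamps": 0}
    for page in response["pages"]:
        for kind in totals:
            totals[kind] += len(page.get(kind, []))
    return totals


# Trimmed illustrative payload using the same keys as the response format.
sample = json.loads("""
{
  "pdf_file": "document.pdf",
  "total_pages": 1,
  "pages": [
    {
      "page_number": 1,
      "qr_codes": [{"id": 1, "data": "https://example.com"}],
      "signatures": [{"id": 1, "confidence": 0.95}],
      "stamps": [{"id": 1, "confidence": 0.87}]
    }
  ]
}
""")

totals = count_detections(sample)
print(totals, sum(totals.values()))
```

Comparing these counts with the `summary` block is a cheap sanity check on a response before further processing.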

+ ## Configuration
+
+ ### DPI Settings
+ The DPI parameter controls the resolution when converting PDF pages to images:
+ - **150 DPI**: Fast processing, suitable for documents with large elements
+ - **200 DPI** (default): Balanced speed and accuracy
+ - **300 DPI**: Higher accuracy for small signatures/stamps, slower processing
+
+ **Impact on Detection**:
+ - **QR Codes**: Moderate impact - very low DPI may miss small QR codes
+ - **Signatures**: High impact - small signatures require higher DPI (200-300)
+ - **Stamps**: High impact - small stamps require higher DPI (200-300)
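
The relationship between DPI and the rendered image size is simple arithmetic: pixels = inches × DPI. The sketch below works it through for an A4 page (210 mm × 297 mm); the A4 assumption and the helper name `page_pixels` are illustrative, and the service's own renderer may round differently:

```python
MM_PER_INCH = 25.4


def page_pixels(width_mm, height_mm, dpi):
    """Return the (width, height) in pixels of a page rendered at the given DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


# A4 page at the three DPI settings discussed above.
for dpi in (150, 200, 300):
    print(dpi, page_pixels(210, 297, dpi))
# 150 (1240, 1754)
# 200 (1654, 2339)
# 300 (2480, 3508)
```

Going from 150 to 300 DPI doubles each dimension and so quadruples the pixel count, which is why higher DPI helps with small signatures and stamps but slows processing.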
+
+ ### Model Requirements
+
+ 1. **Signature Model**: Automatically downloaded from Hugging Face Hub on first use
+    - Requires `HF_TOKEN` environment variable for gated model access
+    - Set in Space Settings → Secrets
+
+ 2. **Stamp Model**: Must be uploaded to `stamp_detector/stamp_model.pt`
+    - Upload via Hugging Face Space web interface or Git LFS
+
+ ## Performance
+
+ - **Concurrent Processing**: Supports up to 4 parallel requests (configurable)
+ - **Processing Time**: Varies by document size and DPI (typically 2-10 seconds per page)
+ - **Memory**: Optimized for efficient model loading and image processing
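
One common way to cap the number of requests processed in parallel is a semaphore. The sketch below shows the idea with `asyncio.Semaphore`; it is an assumption about how such a limit could be enforced, not a description of this service's implementation, and `MAX_PARALLEL` and `process_document` are hypothetical names:

```python
import asyncio

MAX_PARALLEL = 4  # assumed limit, mirroring the "up to 4" figure above


async def process_document(name, limiter, active, peak):
    """Simulate processing one document while respecting the concurrency limit."""
    async with limiter:
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.01)  # stand-in for the real detection work
        active[0] -= 1
        return name


async def main():
    limiter = asyncio.Semaphore(MAX_PARALLEL)
    active, peak = [0], [0]  # single-item lists act as shared counters
    names = [f"doc_{i}.pdf" for i in range(10)]
    tasks = (process_document(n, limiter, active, peak) for n in names)
    results = await asyncio.gather(*tasks)
    return results, peak[0]


results, peak = asyncio.run(main())
print(len(results), "documents processed, peak concurrency:", peak)
```

Requests beyond the limit wait for a free slot instead of being rejected, which keeps memory use bounded when several large PDFs arrive at once.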
+
+ ## Deployment
+
+ This API is containerized using Docker and can be deployed on:
+ - Hugging Face Spaces (current deployment)
+ - Any Docker-compatible platform
+ - Local development with GPU support

+ ## License

+ MIT License
signature/README.md DELETED
@@ -1,118 +0,0 @@
- # YOLOv8 Signature Detector
-
- This repository implements signature detection using the YOLOv8s model from [tech4humans/yolov8s-signature-detector](https://huggingface.co/tech4humans/yolov8s-signature-detector).
-
- ## Setup
-
- Install dependencies:
-
- ```bash
- pip install -r requirements.txt
- ```
-
- ### Authentication
-
- The model repository is gated and requires Hugging Face authentication. Use one of the following options:
-
- 1. **Login via CLI** (recommended):
-    ```bash
-    huggingface-cli login
-    ```
-    Enter your Hugging Face token when prompted. Get your token from [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
-
- 2. **Or set environment variable**:
-    ```bash
-    export HF_TOKEN=your_token_here
-    ```
-
- 3. **Or manually download the model**:
-    ```bash
-    huggingface-cli download tech4humans/yolov8s-signature-detector yolov8s.pt
-    ```
-    Then place `yolov8s.pt` in the project root directory.
-
- ## Usage
-
- ### Python Script
-
- Process all images in the `inputs/` directory:
-
- ```bash
- python inference.py
- ```
-
- The script will:
- 1. Check for a local `yolov8s.pt` file first
- 2. If not found, download the model from Hugging Face (requires authentication)
- 3. Process all images in the `inputs/` directory
- 4. Save annotated images with detected signatures to the `outputs/` directory
- 5. **Save signature coordinates to `outputs/signature_coordinates.json`**
- 6. **Crop and save individual signatures to `outputs/signatures/` directory**
-
- ### CLI (Alternative)
-
- You can also use the Ultralytics CLI:
-
- ```bash
- huggingface-cli download tech4humans/yolov8s-signature-detector yolov8s.pt
- yolo predict model=yolov8s.pt source=inputs/
- ```
-
- ## Model Formats
-
- The model is available in multiple formats:
- - `yolov8s.pt` (PyTorch format) - used by default
- - `yolov8s.onnx` (ONNX format) - for ONNX Runtime
- - `yolov8s.engine` (TensorRT format) - for TensorRT inference
-
- ## Output
-
- The script generates several outputs:
-
- 1. **Annotated images**: Images with bounding boxes around detected signatures saved to `outputs/` with the prefix `detected_`
- 2. **Signature coordinates JSON**: All detection coordinates saved to `outputs/signature_coordinates.json` with the following structure:
-    ```json
-    [
-      {
-        "image": "image1.jpg",
-        "image_width": 1920,
-        "image_height": 1080,
-        "signatures": [
-          {
-            "signature_id": 1,
-            "confidence": 0.95,
-            "bbox": {
-              "x1": 100.5,
-              "y1": 200.3,
-              "x2": 300.7,
-              "y2": 400.9,
-              "width": 200.2,
-              "height": 200.6
-            },
-            "class_id": 0,
-            "cropped_path": "outputs/signatures/image1_signature_1.jpg"
-          }
-        ]
-      }
-    ]
-    ```
-
-    The `image_width` and `image_height` fields allow the frontend to properly scale coordinates when displaying images at different sizes. Coordinates are in pixels relative to the original image dimensions.
- 3. **Cropped signatures**: Individual signature images saved to `outputs/signatures/` directory
-
- ## Extracting Signatures from Coordinates
-
- If you need to re-extract signatures using the saved coordinates, use the helper script:
-
- ```bash
- python extract_signatures.py
- ```
-
- Or specify a custom JSON file:
-
- ```bash
- python extract_signatures.py outputs/signature_coordinates.json
- ```
-
- This is useful if you want to extract signatures again without running inference, or if you need to adjust the extraction parameters.
-
stamp_detector/README.md DELETED
@@ -1,121 +0,0 @@
- # Stamp Detector
-
- A simple tool for detecting stamps in images using YOLOv8.
-
- ## Installation
-
- ```bash
- pip install -r requirements.txt
- ```
-
- ## Usage
-
- ### Basic usage
-
- ```bash
- python detect.py path/to/image.jpg
- ```
-
- ### With a custom confidence threshold
-
- ```bash
- python detect.py path/to/image.jpg --conf 0.20
- ```
-
- ### With a model path
-
- ```bash
- python detect.py path/to/image.jpg --model stamp_model.pt
- ```
-
- ### With an output file
-
- ```bash
- python detect.py path/to/image.jpg --output result.jpg
- ```
-
- ### Saving JSON with coordinates
-
- ```bash
- # Save JSON to output/{filename}_result.json
- python detect.py path/to/image.jpg --json
-
- # Save JSON to the specified file
- python detect.py path/to/image.jpg --json-output results.json
- ```
-
- ## Parameters
-
- - `image_path` (required) - path to the input image
- - `--model` - path to the model (default: `stamp_model.pt`)
- - `--output` - path for saving the result (default: `output/{filename}_result.jpg`)
- - `--conf` - confidence threshold (default: 0.25)
- - `--json` - save JSON with detection coordinates
- - `--json-output` - path for saving the JSON file
-
- ## Structure
-
- ```
- stamp_detector/
- ├── stamp_model.pt # Trained YOLOv8 model
- ├── detect.py # Detection script
- ├── requirements.txt # Dependencies
- └── README.md # Documentation
- ```
-
- ## Examples
-
- ```bash
- # Detection with threshold 0.25
- python detect.py image.jpg
-
- # More sensitive detection (lower threshold)
- python detect.py image.jpg --conf 0.15
-
- # Less sensitive detection (higher threshold)
- python detect.py image.jpg --conf 0.35
-
- # Detection with JSON coordinates saved
- python detect.py image.jpg --json
- ```
-
- ## JSON format
-
- When the `--json` flag is used, a JSON file with the following structure is created:
-
- ```json
- {
-   "image_path": "output/image_result.jpg",
-   "image_size": {
-     "width": 1920,
-     "height": 1080
-   },
-   "detections_count": 2,
-   "detections": [
-     {
-       "class": "stamp",
-       "confidence": 0.8542,
-       "bbox": {
-         "x1": 100,
-         "y1": 200,
-         "x2": 300,
-         "y2": 400,
-         "width": 200,
-         "height": 200
-       },
-       "bbox_normalized": {
-         "x1": 0.052083,
-         "y1": 0.185185,
-         "x2": 0.15625,
-         "y2": 0.37037,
-         "width": 0.104167,
-         "height": 0.185185
-       }
-     }
-   ]
- }
- ```
-
- - `bbox` - absolute coordinates in pixels
- - `bbox_normalized` - normalized coordinates (0.0 - 1.0) relative to the image size
-