bekzhanK1 committed 27 days ago

Commit

62d55bf

1 Parent(s): c05c750

add readme

Files changed (3)

README.md +168 -19
signature/README.md +0 -118
stamp_detector/README.md +0 -121

README.md CHANGED

@@ -10,28 +10,59 @@ license: mit
 # Document Processing Pipeline API
-FastAPI service for detecting QR codes, signatures, and stamps in PDF documents.
-## Features
-- **QR Code Detection**: Detects and decodes QR codes in documents
-- **Signature Detection**: Uses YOLOv8s to detect signatures
-- **Stamp Detection**: Uses YOLOv8 to detect stamps/seals
-- **PDF Support**: Processes multi-page PDF documents
-## API Endpoints
-- `POST /process-pdf` - Upload and process PDF file
-- `POST /process-pdf-from-url` - Process PDF from URL (S3 or HTTP/HTTPS)
-- `GET /docs` - Interactive API documentation
-- `GET /health` - Health check
-Visit `/docs` for interactive API documentation.
-## Usage
-### Process PDF via API
 ```bash
 curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
   -F "file=@document.pdf" \
@@ -39,14 +70,132 @@ curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
   -F "stamp_conf=0.25"
 ```
-### Process PDF from URL
 ```bash
-curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf-from-url?pdf_url=https://example.com/document.pdf"
 ```
-## Model Requirements
-- Signature model: Automatically downloaded from Hugging Face
-- Stamp model: Must be uploaded to `stamp_detector/stamp_model.pt` in this repository

 # Document Processing Pipeline API
+A production-ready FastAPI service that detects and extracts QR codes, signatures, and stamps from PDF documents. Each page of a multi-page PDF passes through three detectors in sequence, and the results are returned as a single consolidated JSON object.
+## Overview
+The API exposes a single interface that combines several computer vision detectors to extract structured information from PDFs. It can process multiple documents concurrently and accepts both file uploads and remote PDF URLs.
+## Detection Models
+### 1. QR Code Detection
+- **Method**: OpenCV `QRCodeDetector` (native implementation)
+- **Library**: OpenCV Python (`cv2`)
+- **Approach**: Runs detection on several preprocessed variants of each page, including adaptive thresholding
+- **Features**:
+  - Detects multiple QR codes per page
+  - Decodes QR code data automatically
+  - Uses CLAHE (Contrast Limited Adaptive Histogram Equalization) for enhanced detection
+  - Tests multiple preprocessing approaches (grayscale, binary, Otsu thresholding, inverted)
+- **Output**: Bounding box coordinates, decoded data, corner points
+### 2. Signature Detection
+- **Model**: YOLOv8s (Small variant)
+- **Source**: `tech4humans/yolov8s-signature-detector` (Hugging Face Hub)
+- **Framework**: Ultralytics YOLO
+- **Architecture**: YOLOv8s, a balance between speed and accuracy
+- **Access**: Gated model (requires Hugging Face authentication token)
+- **Features**:
+  - Real-time signature detection
+  - Confidence scoring for each detection
+  - Bounding box coordinates with normalized values
+- **Output**: Signature locations, confidence scores, bounding boxes
+### 3. Stamp Detection
+- **Model**: Custom YOLOv8 model
+- **Framework**: Ultralytics YOLO
+- **Model File**: `stamp_model.pt` (custom trained)
+- **Default Confidence Threshold**: 0.25
+- **Features**:
+  - Detects stamps and seals on documents
+  - Configurable confidence threshold
+  - Supports custom model paths
+- **Output**: Stamp locations, confidence scores, bounding boxes
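The configurable confidence threshold can be pictured as a filter over scored detections. The following is a minimal sketch only: `filter_by_confidence` and the sample detections are hypothetical, and a YOLO model normally applies its threshold internally during prediction rather than as a separate step like this.

```python
from typing import Dict, List


def filter_by_confidence(detections: List[Dict], threshold: float = 0.25) -> List[Dict]:
    """Keep detections whose confidence is at least `threshold` (hypothetical helper)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return [d for d in detections if d["confidence"] >= threshold]


# Made-up detections, used only to show the effect of the threshold.
detections = [
    {"id": 1, "confidence": 0.87},
    {"id": 2, "confidence": 0.31},
    {"id": 3, "confidence": 0.12},
]

kept_default = filter_by_confidence(detections)                # documented default of 0.25
kept_strict = filter_by_confidence(detections, threshold=0.5)  # stricter threshold

print([d["id"] for d in kept_default])  # [1, 2]
print([d["id"] for d in kept_strict])   # [1]
```

A lower threshold returns more candidate stamps at the cost of more false positives; a higher one does the opposite.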
+## API Endpoints
+### `POST /process-pdf`
+Upload and process a PDF file directly.
+**Parameters**:
+- `file` (multipart/form-data): PDF file to process
+- `dpi` (int, default: 200): Resolution for PDF to image conversion
+- `stamp_conf` (float, default: 0.25): Confidence threshold for stamp detection
+**Example**:
 ```bash
 curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf" \
   -F "file=@document.pdf" \
   -F "stamp_conf=0.25"
 ```
+### `POST /process-pdf-advanced`
+Process a PDF with advanced options, including a custom stamp model path.
+**Parameters**:
+- `file` (multipart/form-data): PDF file to process
+- `dpi` (int, default: 200): Resolution for PDF to image conversion
+- `stamp_conf` (float, default: 0.25): Confidence threshold for stamp detection
+- `stamp_model` (str, optional): Path to custom stamp model
+### `POST /process-pdf-from-url`
+Process a PDF from a remote URL (S3, HTTP, or HTTPS).
+**Parameters**:
+- `pdf_url` (query string): URL to PDF file
+- `dpi` (int, default: 200): Resolution for PDF to image conversion
+- `stamp_conf` (float, default: 0.25): Confidence threshold for stamp detection
+- `stamp_model` (str, optional): Path to custom stamp model
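Because `pdf_url` is itself a URL passed inside a query string, it is safest to percent-encode it when it contains spaces, `?`, or `&`. The sketch below builds the request URL with the standard library; `build_process_url` is a hypothetical helper, and the parameter names are taken from the list above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://bekzhanK1-armeta-hackaton.hf.space"


def build_process_url(pdf_url: str, dpi: int = 200, stamp_conf: float = 0.25,
                      stamp_model: Optional[str] = None) -> str:
    """Build the /process-pdf-from-url request URL with an encoded query string."""
    params = {"pdf_url": pdf_url, "dpi": dpi, "stamp_conf": stamp_conf}
    if stamp_model is not None:  # optional parameter, sent only when provided
        params["stamp_model"] = stamp_model
    return f"{BASE_URL}/process-pdf-from-url?{urlencode(params)}"


# A PDF URL with a space and its own query string, both of which need encoding.
source = "https://example.com/my report.pdf?version=2"
url = build_process_url(source)
print(url)

# Decoding the query string recovers the original values.
params = parse_qs(urlsplit(url).query)
print(params["pdf_url"][0] == source)  # True
```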
+**Example**:
 ```bash
+curl -X POST "https://bekzhanK1-armeta-hackaton.hf.space/process-pdf-from-url?pdf_url=https://example.com/document.pdf&dpi=200"
+```
+### `GET /health`
+Health check endpoint.
+### `GET /docs`
+Interactive API documentation (Swagger UI).
+## Response Format
+The API returns a JSON object with the following structure:
+```json
+{
+  "pdf_file": "document.pdf",
+  "total_pages": 1,
+  "summary": {
+    "total_pages": 1,
+    "total_qr_codes": 1,
+    "total_signatures": 1,
+    "total_stamps": 1,
+    "total_detections": 3
+  },
+  "pages": [
+    {
+      "page_number": 1,
+      "image": "document_page_1.jpg",
+      "image_dimensions": {
+        "width": 1654,
+        "height": 2339
+      },
+      "qr_codes": [
+        {
+          "id": 1,
+          "x": 100,
+          "y": 200,
+          "width": 150,
+          "height": 150,
+          "data": "https://example.com"
+        }
+      ],
+      "signatures": [
+        {
+          "id": 1,
+          "confidence": 0.95,
+          "bbox": {
+            "x1": 500,
+            "y1": 800,
+            "x2": 700,
+            "y2": 900
+          }
+        }
+      ],
+      "stamps": [
+        {
+          "id": 1,
+          "confidence": 0.87,
+          "bbox": {
+            "x1": 1200,
+            "y1": 100,
+            "x2": 1400,
+            "y2": 300
+          }
+        }
+      ]
+    }
+  ]
+}
 ```
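A client can sanity-check a response by recounting the per-page lists. This is a sketch under the assumption that each `summary` total is the sum of the corresponding per-page list lengths; `count_detections` and the trimmed sample response are hypothetical.

```python
import json


def count_detections(response: dict) -> dict:
    """Recompute the summary totals from the per-page detection lists."""
    pages = response["pages"]
    counts = {
        "total_pages": len(pages),
        "total_qr_codes": sum(len(p["qr_codes"]) for p in pages),
        "total_signatures": sum(len(p["signatures"]) for p in pages),
        "total_stamps": sum(len(p["stamps"]) for p in pages),
    }
    counts["total_detections"] = (
        counts["total_qr_codes"] + counts["total_signatures"] + counts["total_stamps"]
    )
    return counts


# Trimmed-down response with the same shape as the documented format.
sample = json.loads("""
{
  "pdf_file": "document.pdf",
  "total_pages": 1,
  "summary": {
    "total_pages": 1,
    "total_qr_codes": 1,
    "total_signatures": 1,
    "total_stamps": 1,
    "total_detections": 3
  },
  "pages": [
    {
      "page_number": 1,
      "qr_codes": [{"id": 1, "data": "https://example.com"}],
      "signatures": [{"id": 1, "confidence": 0.95}],
      "stamps": [{"id": 1, "confidence": 0.87}]
    }
  ]
}
""")

recount = count_detections(sample)
print(recount == sample["summary"])  # True
```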
+## Configuration
+### DPI Settings
+The DPI parameter controls the resolution when converting PDF pages to images:
+- **150 DPI**: Fast processing, suitable for documents with large elements
+- **200 DPI** (default): Balanced speed and accuracy
+- **300 DPI**: Higher accuracy for small signatures/stamps, slower processing
+**Impact on Detection**:
+- **QR Codes**: Moderate impact - very low DPI may miss small QR codes
+- **Signatures**: High impact - small signatures require higher DPI (200-300)
+- **Stamps**: High impact - small stamps require higher DPI (200-300)
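To see why DPI matters for small elements, it helps to convert physical sizes to pixels. The sketch below assumes an A4 page (210 × 297 mm), a hypothetical 20 mm stamp, and rounding to the nearest pixel; the actual PDF converter may round differently.

```python
MM_PER_INCH = 25.4


def pixels(length_mm: float, dpi: int) -> int:
    """Convert a physical length in millimetres to pixels at the given DPI."""
    return round(length_mm / MM_PER_INCH * dpi)


A4_MM = (210, 297)   # assumed page size
STAMP_MM = 20        # hypothetical stamp diameter

for dpi in (150, 200, 300):
    width, height = (pixels(side, dpi) for side in A4_MM)
    print(f"{dpi} DPI: page {width}x{height} px, stamp {pixels(STAMP_MM, dpi)} px")

# 150 DPI: page 1240x1754 px, stamp 118 px
# 200 DPI: page 1654x2339 px, stamp 157 px
# 300 DPI: page 2480x3508 px, stamp 236 px
```

The 200 DPI result matches the `image_dimensions` in the Response Format example (1654 × 2339). Going from 150 to 300 DPI doubles the pixel width of every element, which is why small signatures and stamps benefit from the higher setting.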
+### Model Requirements
+1. **Signature Model**: Automatically downloaded from Hugging Face Hub on first use
+   - Requires `HF_TOKEN` environment variable for gated model access
+   - Set in Space Settings → Secrets
+2. **Stamp Model**: Must be uploaded to `stamp_detector/stamp_model.pt`
+   - Upload via Hugging Face Space web interface or Git LFS
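Both requirements can be checked before the service starts. The following is a minimal sketch of such a preflight check, not part of the repository: `missing_requirements` is a hypothetical helper, and it only tests that `HF_TOKEN` is set and that the stamp model file exists at the documented path.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

# Path documented above, relative to the repository root.
STAMP_MODEL_PATH = Path("stamp_detector") / "stamp_model.pt"


def missing_requirements(repo_root: str = ".",
                         env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return human-readable problems; an empty list means both checks passed."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set (needed for the gated signature model)")
    stamp_model = Path(repo_root) / STAMP_MODEL_PATH
    if not stamp_model.is_file():
        problems.append(f"stamp model not found at {stamp_model}")
    return problems


for problem in missing_requirements():
    print("WARNING:", problem)
```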
+## Performance
+- **Concurrent Processing**: Supports up to 4 parallel requests (configurable)
+- **Processing Time**: Varies by document size and DPI (typically 2-10 seconds per page)
+- **Memory**: Optimized for efficient model loading and image processing
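The README does not say how the limit of 4 parallel requests is enforced. One common way to cap concurrency in an async service is a semaphore, sketched below; `MAX_PARALLEL`, `process_document`, and the sleep that stands in for detection work are all assumptions for illustration.

```python
import asyncio

MAX_PARALLEL = 4  # assumed to mirror the documented limit


async def process_document(name: str, semaphore: asyncio.Semaphore, stats: dict) -> str:
    """Pretend to process one document while holding a semaphore slot."""
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for the real detection work
        stats["active"] -= 1
    return f"{name}: done"


async def main():
    semaphore = asyncio.Semaphore(MAX_PARALLEL)
    stats = {"active": 0, "peak": 0}
    names = [f"doc_{i}.pdf" for i in range(10)]
    results = await asyncio.gather(
        *(process_document(n, semaphore, stats) for n in names)
    )
    return results, stats["peak"]


results, peak = asyncio.run(main())
print(f"processed {len(results)} documents, peak concurrency {peak}")
```

All ten documents complete, but the number processed at the same moment never exceeds `MAX_PARALLEL`.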
+## Deployment
+This API is containerized using Docker and can be deployed on:
+- Hugging Face Spaces (current deployment)
+- Any Docker-compatible platform
+- Local development with GPU support
+## License
+MIT License

signature/README.md DELETED

@@ -1,118 +0,0 @@
-# YOLOv8 Signature Detector
-This repository implements signature detection using the YOLOv8s model from [tech4humans/yolov8s-signature-detector](https://huggingface.co/tech4humans/yolov8s-signature-detector).
-## Setup
-Install dependencies:
-```bash
-pip install -r requirements.txt
-```
-### Authentication
-The model repository is gated and requires Hugging Face authentication. Use one of the following options:
-1. **Login via CLI** (recommended):
-   ```bash
-   huggingface-cli login
-   ```
-   Enter your Hugging Face token when prompted. Get your token from [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
-2. **Or set environment variable**:
-   ```bash
-   export HF_TOKEN=your_token_here
-   ```
-3. **Or manually download the model**:
-   ```bash
-   huggingface-cli download tech4humans/yolov8s-signature-detector yolov8s.pt
-   ```
-   Then place `yolov8s.pt` in the project root directory.
-## Usage
-### Python Script
-Process all images in the `inputs/` directory:
-```bash
-python inference.py
-```
-The script will:
-1. Check for a local `yolov8s.pt` file first
-2. If not found, download the model from Hugging Face (requires authentication)
-3. Process all images in the `inputs/` directory
-4. Save annotated images with detected signatures to the `outputs/` directory
-5. **Save signature coordinates to `outputs/signature_coordinates.json`**
-6. **Crop and save individual signatures to `outputs/signatures/` directory**
-### CLI (Alternative)
-You can also use the Ultralytics CLI:
-```bash
-huggingface-cli download tech4humans/yolov8s-signature-detector yolov8s.pt
-yolo predict model=yolov8s.pt source=inputs/
-```
-## Model Formats
-The model is available in multiple formats:
-- `yolov8s.pt` (PyTorch format) - used by default
-- `yolov8s.onnx` (ONNX format) - for ONNX Runtime
-- `yolov8s.engine` (TensorRT format) - for TensorRT inference
-## Output
-The script generates several outputs:
-1. **Annotated images**: Images with bounding boxes around detected signatures saved to `outputs/` with the prefix `detected_`
-2. **Signature coordinates JSON**: All detection coordinates saved to `outputs/signature_coordinates.json` with the following structure:
-   ```json
-   [
-     {
-       "image": "image1.jpg",
-       "image_width": 1920,
-       "image_height": 1080,
-       "signatures": [
-         {
-           "signature_id": 1,
-           "confidence": 0.95,
-           "bbox": {
-             "x1": 100.5,
-             "y1": 200.3,
-             "x2": 300.7,
-             "y2": 400.9,
-             "width": 200.2,
-             "height": 200.6
-           },
-           "class_id": 0,
-           "cropped_path": "outputs/signatures/image1_signature_1.jpg"
-         }
-       ]
-     }
-   ]
-   ```
-   The `image_width` and `image_height` fields allow the frontend to properly scale coordinates when displaying images at different sizes. Coordinates are in pixels relative to the original image dimensions.
-3. **Cropped signatures**: Individual signature images saved to `outputs/signatures/` directory
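The coordinate scaling mentioned in item 2 can be sketched as follows. `scale_bbox` is a hypothetical helper, and the 960 × 540 display size is an assumption chosen for illustration; the bounding box values come from the JSON example above.

```python
def scale_bbox(bbox: dict, image_width: int, image_height: int,
               display_width: int, display_height: int) -> dict:
    """Map pixel coordinates from the original image to a resized display."""
    sx = display_width / image_width    # horizontal scale factor
    sy = display_height / image_height  # vertical scale factor
    return {
        "x1": bbox["x1"] * sx,
        "y1": bbox["y1"] * sy,
        "x2": bbox["x2"] * sx,
        "y2": bbox["y2"] * sy,
    }


# Bounding box from the example, detected on a 1920x1080 image.
bbox = {"x1": 100.5, "y1": 200.3, "x2": 300.7, "y2": 400.9}

# The same image shown at half size (960x540): every coordinate is halved.
scaled = scale_bbox(bbox, 1920, 1080, 960, 540)
print(scaled)
```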
-## Extracting Signatures from Coordinates
-If you need to re-extract signatures using the saved coordinates, use the helper script:
-```bash
-python extract_signatures.py
-```
-Or specify a custom JSON file:
-```bash
-python extract_signatures.py outputs/signature_coordinates.json
-```
-This is useful if you want to extract signatures again without running inference, or if you need to adjust the extraction parameters.

stamp_detector/README.md DELETED

@@ -1,121 +0,0 @@
-# Stamp Detector
-A simple tool for detecting stamps in images using YOLOv8.
-## Installation
-```bash
-pip install -r requirements.txt
-```
-## Usage
-### Basic usage
-```bash
-python detect.py path/to/image.jpg
-```
-### With a custom confidence threshold
-```bash
-python detect.py path/to/image.jpg --conf 0.20
-```
-### With a custom model path
-```bash
-python detect.py path/to/image.jpg --model stamp_model.pt
-```
-### With a custom output file
-```bash
-python detect.py path/to/image.jpg --output result.jpg
-```
-### Saving JSON with coordinates
-```bash
-# Save JSON to output/{filename}_result.json
-python detect.py path/to/image.jpg --json
-# Save JSON to the specified file
-python detect.py path/to/image.jpg --json-output results.json
-```
-## Parameters
-- `image_path` (required) - path to the input image
-- `--model` - path to the model (default: `stamp_model.pt`)
-- `--output` - path for saving the result (default: `output/{filename}_result.jpg`)
-- `--conf` - confidence threshold (default: 0.25)
-- `--json` - save JSON with detection coordinates
-- `--json-output` - path for saving the JSON file
-## Structure
-```
-stamp_detector/
-├── stamp_model.pt      # Trained YOLOv8 model
-├── detect.py           # Detection script
-├── requirements.txt    # Dependencies
-└── README.md          # Documentation
-```
-## Examples
-```bash
-# Detection with a threshold of 0.25
-python detect.py image.jpg
-# More sensitive detection (lower threshold)
-python detect.py image.jpg --conf 0.15
-# Less sensitive detection (higher threshold)
-python detect.py image.jpg --conf 0.35
-# Detection with JSON coordinates saved
-python detect.py image.jpg --json
-```
-## JSON format
-When the `--json` flag is used, a JSON file with the following structure is created:
-```json
-{
-  "image_path": "output/image_result.jpg",
-  "image_size": {
-    "width": 1920,
-    "height": 1080
-  },
-  "detections_count": 2,
-  "detections": [
-    {
-      "class": "stamp",
-      "confidence": 0.8542,
-      "bbox": {
-        "x1": 100,
-        "y1": 200,
-        "x2": 300,
-        "y2": 400,
-        "width": 200,
-        "height": 200
-      },
-      "bbox_normalized": {
-        "x1": 0.052083,
-        "y1": 0.185185,
-        "x2": 0.15625,
-        "y2": 0.37037,
-        "width": 0.104167,
-        "height": 0.185185
-      }
-    }
-  ]
-}
-```
-- `bbox` - absolute coordinates in pixels
-- `bbox_normalized` - normalized coordinates (0.0 - 1.0) relative to the image size
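The relationship between `bbox` and `bbox_normalized` is a division by the image size: x values by the width, y values by the height. The sketch below reproduces the numbers from the JSON example; `normalize_bbox` is a hypothetical helper, and rounding to six decimal places is inferred from the example values rather than confirmed by the script.

```python
def normalize_bbox(bbox: dict, image_width: int, image_height: int,
                   ndigits: int = 6) -> dict:
    """Convert absolute pixel coordinates to the 0.0-1.0 range."""
    return {
        "x1": round(bbox["x1"] / image_width, ndigits),
        "y1": round(bbox["y1"] / image_height, ndigits),
        "x2": round(bbox["x2"] / image_width, ndigits),
        "y2": round(bbox["y2"] / image_height, ndigits),
        "width": round(bbox["width"] / image_width, ndigits),
        "height": round(bbox["height"] / image_height, ndigits),
    }


# Values from the JSON example: a 1920x1080 image with one detection.
bbox = {"x1": 100, "y1": 200, "x2": 300, "y2": 400, "width": 200, "height": 200}
normalized = normalize_bbox(bbox, image_width=1920, image_height=1080)
print(normalized)
# {'x1': 0.052083, 'y1': 0.185185, 'x2': 0.15625, 'y2': 0.37037,
#  'width': 0.104167, 'height': 0.185185}
```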