protonx-models/protonx-table-detector

table-recognition

hoangth2k4 committed 29 days ago

Commit f18ee32 (verified) · 1 parent: 0f6ea68

Update README.md

README.md +83 -3

@@ -1,5 +1,79 @@
 ## **Quick Usage**
 ```python
 import torch
 import torch.nn as nn
@@ -10,9 +84,9 @@ from PIL import Image
 from huggingface_hub import hf_hub_download
 class TableDetector:
-    def __init__(self, device: str = 'cpu'):
         self.device = torch.device(device)
-        self.model_path = hf_hub_download(repo_id="protonx-models/protonx-ocr-tool-table-detector", filename="model/table_detector.pth")
         self.model = self.load_model(self.model_path)
         self.model.to(self.device)
         self.model.eval()
@@ -40,9 +114,15 @@ class TableDetector:
         return 'have_table' if preds.item() == 1 else 'no_table'
 if __name__ == "__main__":
-    model = TableDetector(device='cpu')
     prediction = model.predict("images/document_page_01.png")
     print(prediction)
 ```

+<!-- ---
+# <div align="center">
+# <p align="center">
+#     <img src="https://storage.googleapis.com/mle-courses-prod/users/61b6fa1ba83a7e37c8309756/private-files/678dadd0-603b-11ef-b0a7-998b84b38d43-ProtonX_logo_horizontally__1_.png" width="260"/>
+# </p>
+# <h1 align="center">
+# ProtonX OCR tool: Table Detector
+# </h1>
+# </div> -->
+---
+## **Introduction**
+### **ProtonX OCR tool: Table Detector**
+This is a **binary image classification model** that determines **whether an input document image contains at least one table**.
+Built on the MobileNetV2 architecture, the model is optimized for **document images and scanned PDFs**, especially **Vietnamese documents**, and is intended as a **fast pre-filtering step** in OCR and document understanding pipelines.
+---
+## **Task Definition**
+**Task**: Binary image classification
+**Objective**: Detect **table presence** in an image
+### **Labels**
+| ID | Label     | Meaning |
+|--|--|--|
+| 0 | `no_table` | Image contains **no tables** |
+| 1 | `have_table` | Image contains **one or more tables** |
+> ⚠️ The model detects **presence**, not the number or location of tables.
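
To illustrate how a class ID becomes a label string, here is a minimal sketch in plain Python. It assumes the classifier emits two raw scores, one per class, which the README does not confirm; `decode` and `ID2LABEL` are hypothetical names, and the label strings are the ones the Quick Usage code returns.

```python
import math

# Assumption: class 0 / class 1 map to the strings returned by the
# Quick Usage code ('no_table' / 'have_table').
ID2LABEL = {0: "no_table", 1: "have_table"}


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(scores):
    """Return (label, probability) for a pair of raw class scores.

    Assumption: `scores` holds exactly two values, ordered by class ID.
    Ties resolve to class 0.
    """
    if len(scores) != 2:
        raise ValueError("expected exactly two class scores")
    probs = softmax(scores)
    class_id = 0 if probs[0] >= probs[1] else 1
    return ID2LABEL[class_id], probs[class_id]
```

The returned probability says how confident the classifier is that a table is present or absent; it carries no information about how many tables there are or where they sit on the page.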
+---
+## **Training Data**
+The model is trained using a combination of:
+### **DocLayNet Dataset**
+- Public document layout dataset
+- High-quality annotations
+- Diverse document layouts
+### **In-house Labeled Vietnamese Document Dataset**
+- Scanned PDFs from Vietnamese documents
+- Mixed-quality OCR inputs
+- Real-world layouts:
+  - Contracts
+  - Administrative forms
+  - Reports
+  - Tables embedded in text-heavy pages
+This combination improves **generalization** across both clean and noisy document images.
 ## **Quick Usage**
+### Using ProtonX library
+```python
+from protonx import ProtonX
+client = ProtonX()
+prediction = client.ocr.detect_table(image_path="images/document_page_01.png")
+print(prediction)
+```
+### Using torchvision
 ```python
 import torch
 import torch.nn as nn
 from huggingface_hub import hf_hub_download
 class TableDetector:
+    def __init__(self, model_name: str, device: str = 'cpu'):
         self.device = torch.device(device)
+        self.model_path = hf_hub_download(repo_id=model_name, filename="model/table_detector.pth")
         self.model = self.load_model(self.model_path)
         self.model.to(self.device)
         self.model.eval()
         return 'have_table' if preds.item() == 1 else 'no_table'
 if __name__ == "__main__":
+    model = TableDetector(model_name='protonx-models/table-detector', device='cpu')
     prediction = model.predict("images/document_page_01.png")
     print(prediction)
 ```
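
The pre-filtering role described in the Introduction could be sketched as follows. This is an illustration under stated assumptions, not part of the README: `split_pages_by_table` is a hypothetical helper, and `detect` stands for any callable that takes an image path and returns `'have_table'` or `'no_table'`, such as the `predict` method above.

```python
from typing import Callable, Iterable, List, Tuple


def split_pages_by_table(
    page_paths: Iterable[str],
    detect: Callable[[str], str],
) -> Tuple[List[str], List[str]]:
    """Partition page images into (with_table, without_table).

    `detect` is assumed to return 'have_table' for pages that contain
    a table and any other string otherwise.
    """
    with_table: List[str] = []
    without_table: List[str] = []
    for path in page_paths:
        if detect(path) == "have_table":
            with_table.append(path)
        else:
            without_table.append(path)
    return with_table, without_table
```

With a `TableDetector` instance named `model`, a call such as `split_pages_by_table(paths, model.predict)` would route only the first list to a heavier table-extraction step and leave the second to ordinary text OCR.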
+## **Acknowledgments**
+Thanks to:
+* [DocLayNet](https://huggingface.co/datasets/docling-project/DocLayNet)