nexaml committed on
Commit 4054da8 · verified · 1 Parent(s): 0b3a4b6

Create README.md

Files changed (1)
  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ # Table-Transformer-Detection
+
+ ## Model Description
+ **Table-Transformer-Detection** is a 28.8-million-parameter object detection model from Microsoft Research, fine-tuned specifically for table detection in documents.
+ Built on the DETR (DEtection TRansformer) architecture, it locates tables in images of unstructured documents, such as rendered PDF pages and scanned pages.
+
+ Trained on PubTables-1M, a large-scale dataset containing nearly one million fully annotated tables from scientific articles, Table-Transformer-Detection delivers strong performance for table detection in documents without requiring task-specific architectural customization.
+
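As a rough sizing aid, the parameter count above can be turned into an estimate of weight memory. This is a minimal sketch that assumes 4 bytes per parameter (32-bit floats); the helper name `weight_memory_bytes` is hypothetical, and the figure covers the weights only, not activations or runtime overhead.

```python
def weight_memory_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Estimate the memory needed to hold the model weights alone."""
    return num_params * bytes_per_param


NUM_PARAMS = 28_800_000  # 28.8M parameters, as stated in the model description

f32_bytes = weight_memory_bytes(NUM_PARAMS)                     # assumed 32-bit floats
f16_bytes = weight_memory_bytes(NUM_PARAMS, bytes_per_param=2)  # hypothetical 16-bit copy

print(f"F32 weights: {f32_bytes / 1e6:.1f} MB ({f32_bytes / 2**20:.1f} MiB)")
print(f"F16 weights: {f16_bytes / 1e6:.1f} MB ({f16_bytes / 2**20:.1f} MiB)")
```

Under these assumptions the F32 weights come to about 115 MB.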
+ ## Features
+ - **Table detection**: locates tables in document images, including rendered PDF pages and scanned pages.
+ - **DETR-based architecture**: uses a Transformer encoder-decoder on top of a CNN backbone (ResNet) for end-to-end object detection.
+ - **Pre-normalization**: uses the "normalize before" setting, applying LayerNorm before self- and cross-attention for improved training stability.
+ - **Lightweight**: with only 28.8M parameters (stored as F32), the model is inexpensive to deploy and run.
+ - **Fine-tunable**: can be further fine-tuned on domain-specific document datasets for improved accuracy.
+
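The "normalize before" setting mentioned above can be illustrated by contrasting the two residual orderings. The following is a simplified sketch, not the model's actual code: vectors are plain Python lists, LayerNorm omits its learnable gain and bias, and `toy_sublayer` is a hypothetical stand-in for an attention layer.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and unit variance (no learnable gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_norm_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Pre-normalization: normalize, apply the sublayer, then add the residual."""
    return [xi + yi for xi, yi in zip(x, sublayer(layer_norm(x)))]


def post_norm_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Post-normalization: apply the sublayer, add the residual, then normalize."""
    return layer_norm([xi + yi for xi, yi in zip(x, sublayer(x))])


def toy_sublayer(x: Vector) -> Vector:
    """Hypothetical stand-in for self- or cross-attention."""
    return [2.0 * v for v in x]


x = [1.0, 2.0, 3.0]
pre = pre_norm_block(x, toy_sublayer)
post = post_norm_block(x, toy_sublayer)
print("pre-norm: ", pre)
print("post-norm:", post)
```

In the pre-norm form the residual path carries `x` through unchanged, which is the property usually credited for more stable training.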
+ ## Use Cases
+ - Automated document processing and digitization pipelines
+ - Table extraction from academic papers and research articles
+ - Invoice and financial document parsing
+ - Legal and regulatory document analysis
+ - Healthcare and clinical report table extraction
+ - Preprocessing step for downstream table structure recognition
+
+ ## Inputs and Outputs
+ **Input**:
+ - Document images (JPEG, PNG, etc.) containing one or more tables.
+
+ **Output**:
+ - Bounding box predictions with confidence scores for each detected table in the image.
+ - Class labels identifying detected objects as tables.
+
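To show how the outputs above might be consumed, here is a minimal post-processing sketch. It assumes the raw boxes follow the usual DETR convention of normalized `(center_x, center_y, width, height)` values in `[0, 1]`; the names `Detection`, `cxcywh_to_xyxy`, and `filter_detections`, the 0.9 threshold, and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple

Box = Tuple[float, float, float, float]


class Detection(NamedTuple):
    label: str
    score: float
    box: Box  # (x_min, y_min, x_max, y_max) in pixels


def cxcywh_to_xyxy(box: Box, width: int, height: int) -> Box:
    """Convert a normalized center-format box to pixel corner coordinates."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    labels: Sequence[str],
    image_size: Tuple[int, int],
    threshold: float = 0.9,  # assumed cutoff; tune for your documents
) -> List[Detection]:
    """Keep predictions at or above the threshold, highest score first."""
    width, height = image_size
    kept = [
        Detection(label, score, cxcywh_to_xyxy(box, width, height))
        for box, score, label in zip(boxes, scores, labels)
        if score >= threshold
    ]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up predictions for a 1000x800 pixel page image.
detections = filter_detections(
    boxes=[(0.5, 0.5, 0.5, 0.25), (0.2, 0.2, 0.1, 0.1)],
    scores=[0.98, 0.30],
    labels=["table", "table"],
    image_size=(1000, 800),
)
for det in detections:
    print(det.label, det.score, det.box)
```

The resulting pixel boxes can then be used to crop each table region before passing it to a structure recognition step.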
+ ## License
+ This repo is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license, which allows use, sharing, and modification only for non-commercial purposes with proper attribution. All NPU-related models, runtimes, and code in this project are covered by this non-commercial license and cannot be used in any commercial or revenue-generating applications. Commercial licensing or enterprise usage requires a separate agreement. For inquiries, please contact `dev@nexa.ai`.