| --- |
| license: mit |
| datasets: |
| - dimtri009/SLANet-1M_dataset |
| language: |
| - en |
| base_model: |
| - PaddlePaddle/SLANet |
| tags: |
| - table-recognition |
| - table-structure-recognition |
| - slanet-1m |
| --- |
| |
|
|
| # SLANet-1M: A Lightweight Model for Table Recognition |
|
|
|
|
| --- |
|
|
| ## 🧾 Overview |
|
|
| **SLANet-1M** is a lightweight convolutional model for **table recognition** designed to extract table structure and cell content from document images efficiently. |
| It is trained on over **one million synthetic and real-world tables** and performs competitively with transformer-based architectures while being significantly smaller and faster. |
|
|
| This model was developed as part of a **Master’s thesis** at the University of Florence and the Swiss AI Center (iCoSys, Fribourg), and presented at **SwissText 2025**. |
|
|
| The paper is available [here](https://aclanthology.org/2025.swisstext-1.9/). |
|
|
| --- |
|
|
| ## 🚀 Key Features |
|
|
| - **Lightweight architecture** (≈9.2M parameters) |
| - **Transformer-free design** for CPU-friendly deployment |
| - **Trained on large-scale datasets** (PubTabNet + SynthTabNet) |
| - **Compatible with deployment pipelines** such as the *Core Engine* |
| - **Outputs** table structure in **HTML** format |
|
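
As a rough illustration of the HTML output, the sketch below joins a sequence of structure tokens with recognized cell texts. The token format (PubTabNet-style tokens such as `<td>`, or `<td` followed by an attribute token and `>`) and the `build_html` helper are assumptions made for this example, not the model's documented interface.

```python
from html import escape


def build_html(structure_tokens, cell_texts):
    """Hypothetical helper: merge structure tokens and cell texts into HTML.

    Assumes PubTabNet-style tokens, where a cell opens either with "<td>"
    or with "<td", attribute tokens, and a closing ">".
    """
    cells = iter(cell_texts)
    parts = []
    for tok in structure_tokens:
        parts.append(tok)
        # A cell's content goes right after its opening tag is complete.
        if tok in ("<td>", ">"):
            parts.append(escape(next(cells, "")))
    return "<table>" + "".join(parts) + "</table>"


tokens = ["<tr>", "<td>", "</td>", "<td", ' colspan="2"', ">", "</td>", "</tr>"]
html_table = build_html(tokens, ["A", "B & C"])
print(html_table)
```

Cells without a matching text are left empty, and cell text is HTML-escaped so that characters such as `&` do not break the markup.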
|
| --- |
|
|
| ## 📦 Model Details |
|
|
| | Property | Description | |
| |-----------|-------------| |
| | **Model Name** | SLANet-1M | |
| | **Architecture** | CNN-based (SLANet variant with depthwise separable convolutions) | |
| | **Parameters** | ~9.2 million | |
| | **Input Size** | 480×480 (RGB) | |
| | **Output Format** | HTML table structure | |
| | **Training Data** | PubTabNet + SynthTabNet (all subsets) | |
| | **Metrics** | S-TEDS: 99.36 on SynthTabNet and 97.36 on PubTabNet | |
|
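
To make the 480×480 input size concrete, here is a minimal sketch of an aspect-preserving resize followed by padding. It assumes the longest image side is scaled to 480 and the remainder is padded on the right and bottom; the `fit_to_square` helper is hypothetical, and the released inference code may preprocess images differently.

```python
def fit_to_square(width, height, target=480):
    """Hypothetical helper: geometry for resizing an image into a square.

    Returns (new_w, new_h, pad_right, pad_bottom), assuming the longest
    side is scaled to `target` and the rest of the square is padding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h, target - new_w, target - new_h


# A 960x480 page crop is halved, then padded at the bottom.
print(fit_to_square(960, 480))
```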
|
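
For reference, S-TEDS is the structure-only variant of the Tree-Edit-Distance-based Similarity (TEDS) metric introduced with PubTabNet: both tables are compared as HTML trees with cell content ignored. The definition below uses $T_a$ and $T_b$ for the predicted and ground-truth trees; the values in the table above are presumably this score multiplied by 100.

```latex
\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|, |T_b|\right)}
```

Here $\mathrm{EditDist}$ is the tree edit distance and $|T|$ is the number of nodes in $T$, so identical structures score 1.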
| --- |
|
|
| If you use SLANet-1M in your work, please cite: |
|
|
| ```bibtex |
| @inproceedings{romaric-etal-2025-slanet, |
| title = "{SLAN}et-1{M}: A Lightweight and Efficient Model for Table Recognition with Minimal Computational Cost", |
| author = "Romaric, Nguinwa Mbakop Dimitri and |
| Petrucci, Andrea and |
| Marinai, Simone and |
| Hennebert, Jean", |
| editor = {Gerber, Jonathan and |
| Cieliebak, Mark and |
| Tuggener, Don and |
| H{\"u}rlimann, Manuela}, |
| booktitle = "Proceedings of the 10th edition of the Swiss Text Analytics Conference", |
| month = may, |
| year = "2025", |
| address = "Winterthur, Switzerland", |
| publisher = "Association for Computational Linguistics", |
| url = "https://aclanthology.org/2025.swisstext-1.9/", |
| pages = "89--102" |
| } |
| ``` |