# lilt-only-base

Layout-only pretrained checkpoint from the official [LiLT repository](https://github.com/jpwang/lilt).

This is **not a complete model**: it contains only the 2D spatial (layout) encoder and no text encoder. It is intended as a building block to be combined with any RoBERTa-like text encoder.
## What is this?

LiLT (Language-Independent Layout Transformer) decouples text and layout understanding into two separate encoders. `lilt-only-base` contains only the **layout encoder** weights, pretrained for document layout understanding on the IIT-CDIP dataset.

Because the layout encoder is independent of language, it can be paired with any RoBERTa-compatible text encoder to produce a language-specific document understanding model.
## Usage

Use [`gen_weight_roberta_like.py`](https://github.com/jpwang/lilt) from the official repository to combine this checkpoint with the text encoder of your choice:
```bash
python gen_weight_roberta_like.py \
    --lilt lilt-only-base/pytorch_model.bin \
    --text your-roberta-model/pytorch_model.bin \
    --config your-roberta-model/config.json \
    --out lilt-your-language-base
```
Compatible text encoders: any RoBERTa-like model (`roberta-base`, `camembert-base`, `microsoft/infoxlm-base`, etc.).
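The combination step can be pictured as merging two weight dictionaries. The following is a minimal sketch of that idea, with plain Python dicts and strings standing in for tensors; it is not the code of `gen_weight_roberta_like.py`. The function name `merge_state_dicts`, the key prefixes `roberta.` and `lilt.`, and the sample keys are assumptions made for illustration, so consult the official script for the actual key handling.

```python
def merge_state_dicts(text_sd, layout_sd, text_prefix="roberta.", target_prefix="lilt."):
    """Merge text-encoder weights with layout-encoder weights.

    Hypothetical helper: the default prefixes are illustrative assumptions,
    not values taken from the official script.
    """
    merged = {}
    for key, value in text_sd.items():
        # Rename text-encoder keys so they sit under the combined model's prefix.
        if key.startswith(text_prefix):
            key = target_prefix + key[len(text_prefix):]
        merged[key] = value

    # Refuse to let layout weights silently overwrite text weights.
    overlap = merged.keys() & layout_sd.keys()
    if overlap:
        raise ValueError(f"conflicting keys: {sorted(overlap)}")

    merged.update(layout_sd)
    return merged


# Toy example: strings stand in for tensors, and the key names are made up.
text_sd = {"roberta.embeddings.word_embeddings.weight": "text-tensor"}
layout_sd = {"lilt.layout_embeddings.x_position_embeddings.weight": "layout-tensor"}
combined = merge_state_dicts(text_sd, layout_sd)
print(sorted(combined))
```

Raising on overlapping keys is a design choice of this sketch: it makes a naming mismatch between the two checkpoints visible instead of letting one set of weights replace the other unnoticed.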
## Files

| File | Description |
|------|-------------|
| `model.safetensors` | Layout encoder weights (safetensors format) |
| `pytorch_model.bin` | Layout encoder weights (PyTorch format) |
| `config.json` | Model configuration (`model_type: liltrobertalike`) |
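To confirm which tensors a `.safetensors` file holds without loading any weights, the file's JSON header can be read with the standard library alone. The sketch below relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the helper name `list_safetensors_tensors` is hypothetical, and the tensor names it returns for this checkpoint are not listed here because they are not stated in this README.

```python
import json
import struct


def list_safetensors_tensors(path):
    """Return {tensor_name: shape} read from a .safetensors header.

    Only the JSON header is read; tensor data is never loaded.
    """
    with open(path, "rb") as f:
        # The file starts with an unsigned 64-bit little-endian header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))

    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `list_safetensors_tensors("lilt-only-base/model.safetensors")` on a local copy of this repository would return the tensor names and shapes stored in the file.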
## Note on model type

This checkpoint uses `model_type = liltrobertalike`, a custom type defined in the original LiLT repository. It cannot be loaded directly with `AutoModel` from Hugging Face Transformers; combine it with a text encoder first, using the procedure above.
## License

MIT, following the original [jpwang/lilt](https://github.com/jpwang/lilt) repository.
## Acknowledgements

- [LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding](https://arxiv.org/abs/2202.13669) (Wang et al., 2022)
- Original weights: [jpwang/lilt](https://github.com/jpwang/lilt)

> **Note**: This is not an official Hugging Face release from the original authors.