# lilt-only-base
Layout-only pretrained checkpoint from the official [LiLT repository](https://github.com/jpwang/lilt).
This is **not a complete model**: it contains only the 2D spatial (layout) encoder and no text encoder. It is intended as a building block to be combined with any RoBERTa-like text encoder.
## What is this?
LiLT (Language-Independent Layout Transformer) decouples text and layout understanding into two separate encoders. `lilt-only-base` contains only the **layout encoder** weights, pretrained for document layout understanding on the IIT-CDIP dataset.
Because the two encoders are decoupled, the layout encoder can be paired with any RoBERTa-compatible text encoder to produce a language-specific document understanding model.
## Usage
Use [`gen_weight_roberta_like.py`](https://github.com/jpwang/lilt) from the official repository to combine this checkpoint with the text encoder of your choice:
```bash
python gen_weight_roberta_like.py \
    --lilt lilt-only-base/pytorch_model.bin \
    --text your-roberta-model/pytorch_model.bin \
    --config your-roberta-model/config.json \
    --out lilt-your-language-base
```
Compatible text encoders: any RoBERTa-like model (`roberta-base`, `camembert-base`, `microsoft/infoxlm-base`, etc.)
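
If you prefer to drive the combination step from Python, the command above can be wrapped as in the minimal sketch below. The helper names `build_combine_command` and `combine` are hypothetical and not part of the LiLT repository. The sketch only reuses the four flags shown above and assumes the script sits in the current working directory and that the text encoder directory contains a `pytorch_model.bin` file.

```python
import subprocess
import sys
from pathlib import Path


def build_combine_command(lilt_dir, text_dir, out_dir, script="gen_weight_roberta_like.py"):
    """Return the argument list for the combination script (does not run it).

    Hypothetical helper: it only mirrors the --lilt/--text/--config/--out
    flags from the shell command in this README.
    """
    lilt_dir, text_dir = Path(lilt_dir), Path(text_dir)
    return [
        sys.executable, str(script),
        "--lilt", str(lilt_dir / "pytorch_model.bin"),
        "--text", str(text_dir / "pytorch_model.bin"),
        "--config", str(text_dir / "config.json"),
        "--out", str(out_dir),
    ]


def combine(lilt_dir, text_dir, out_dir, script="gen_weight_roberta_like.py"):
    """Check that the three input files exist, then run the script."""
    cmd = build_combine_command(lilt_dir, text_dir, out_dir, script)
    # Positions 3, 5 and 7 hold the values of --lilt, --text and --config.
    inputs = [cmd[3], cmd[5], cmd[7]]
    missing = [p for p in inputs if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(cmd, check=True)
```

Calling `combine("lilt-only-base", "your-roberta-model", "lilt-your-language-base")` would then be equivalent to the shell command, with an early error if any input file is missing.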
## Files
| File | Description |
|------|-------------|
| `model.safetensors` | Layout encoder weights (safetensors format) |
| `pytorch_model.bin` | Layout encoder weights (PyTorch format) |
| `config.json` | Model configuration (`model_type: liltrobertalike`) |
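
To see which tensors the checkpoint actually contains, the header of `model.safetensors` can be read with the standard library alone. This is a minimal sketch that relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names are hypothetical, and no particular tensor names are assumed for this checkpoint.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def tensor_summary(path):
    """Map each tensor name to (dtype, shape), skipping the metadata entry."""
    header = read_safetensors_header(path)
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    }


# Example use (assumes model.safetensors is in the current directory):
# for name, (dtype, shape) in sorted(tensor_summary("model.safetensors").items()):
#     print(name, dtype, shape)
```

Only the header is read, so this works without PyTorch installed and without loading the weights into memory.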
## Note on model type
This checkpoint uses `model_type = liltrobertalike`, a custom type defined in the original LiLT repository. It cannot be loaded directly with `AutoModel` from Hugging Face Transformers; combine it with a text encoder first, using the procedure above.
## License
MIT — following the original [jpwang/lilt](https://github.com/jpwang/lilt) repository.
## Acknowledgements
- [LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding](https://arxiv.org/abs/2202.13669) — Wang et al., 2022
- Original weights: [jpwang/lilt](https://github.com/jpwang/lilt)
> **Note**: This is not an official Hugging Face release from the original authors.