RomDev2/lilt-only-base

	# lilt-only-base

	Layout-only pretrained checkpoint from the official [LiLT repository](https://github.com/jpwang/lilt).

	This is not a complete model: it contains only the 2D spatial (layout) encoder and no text encoder. It is intended as a building block to be combined with any RoBERTa-like text encoder.

	## What is this?

	LiLT (Language-Independent Layout Transformer) decouples text and layout understanding into two separate encoders. `lilt-only-base` contains only the layout encoder weights, pretrained on the IIT-CDIP document dataset.

	Because the layout encoder does not depend on the language of the text, it can be combined with any RoBERTa-compatible text encoder to produce a language-specific document understanding model.

	## Usage

	Use [`gen_weight_roberta_like.py`](https://github.com/jpwang/lilt) from the official repository to combine this checkpoint with the text encoder of your choice:

	```bash
	python gen_weight_roberta_like.py \
	--lilt lilt-only-base/pytorch_model.bin \
	--text your-roberta-model/pytorch_model.bin \
	--config your-roberta-model/config.json \
	--out lilt-your-language-base
	```

	Compatible text encoders: any RoBERTa-like model, for example `roberta-base`, `camembert-base`, or `microsoft/infoxlm-base`.
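
When merging several text encoders in one go, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the official repository: `build_merge_command` is a hypothetical helper that only builds the argument list shown above. It assumes each checkpoint directory stores its weights as `pytorch_model.bin` and that the text encoder directory also holds `config.json`.

```python
import sys
from pathlib import Path


def build_merge_command(lilt_dir, text_dir, out_dir,
                        script="gen_weight_roberta_like.py"):
    """Assemble the argv list for the merge script (hypothetical helper).

    Uses only the flags documented above (--lilt, --text, --config, --out).
    Assumes weights are stored as pytorch_model.bin in each directory.
    """
    lilt_dir, text_dir = Path(lilt_dir), Path(text_dir)
    return [
        sys.executable, script,
        "--lilt", str(lilt_dir / "pytorch_model.bin"),
        "--text", str(text_dir / "pytorch_model.bin"),
        "--config", str(text_dir / "config.json"),
        "--out", str(out_dir),
    ]


if __name__ == "__main__":
    cmd = build_merge_command(
        "lilt-only-base", "your-roberta-model", "lilt-your-language-base"
    )
    print(" ".join(cmd))
    # To actually run it, pass `cmd` to subprocess.run(cmd, check=True)
    # from a checkout of the LiLT repository with both checkpoints on disk.
```

The helper returns a list rather than a shell string so that paths containing spaces need no quoting when passed to `subprocess.run`.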

	## Files

	| File | Description |
	|------|-------------|
	| `model.safetensors` | Layout encoder weights (safetensors format) |
	| `pytorch_model.bin` | Layout encoder weights (PyTorch format) |
	| `config.json` | Model configuration (`model_type: liltrobertalike`) |
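
To see what the weight file contains before merging, one option is to list its parameter names grouped by prefix. This is a minimal sketch, assuming PyTorch is installed and that `pytorch_model.bin` is a plain state dict (a mapping from parameter name to tensor). `count_by_prefix` and `load_state_dict_keys` are hypothetical helper names, and nothing here assumes particular key names inside the checkpoint.

```python
from collections import Counter


def count_by_prefix(keys, depth=2):
    """Count parameter names by their first `depth` dot-separated parts."""
    return Counter(".".join(k.split(".")[:depth]) for k in keys)


def load_state_dict_keys(path):
    """Return the parameter names stored in a PyTorch checkpoint file.

    Assumes the file holds a plain state dict, not a pickled module.
    """
    import torch  # imported lazily so count_by_prefix works without PyTorch

    state = torch.load(path, map_location="cpu")
    return list(state.keys())


if __name__ == "__main__":
    keys = load_state_dict_keys("lilt-only-base/pytorch_model.bin")
    for prefix, n in sorted(count_by_prefix(keys).items()):
        print(f"{prefix}: {n} tensors")
```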

	## Note on model type

	This checkpoint uses `model_type = liltrobertalike`, a custom type defined in the original LiLT repository. It cannot be loaded directly with `AutoModel` from Hugging Face Transformers; it must first be combined with a text encoder via the procedure above.
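
A script that handles several checkpoints can read `config.json` first and branch on the declared type instead of failing inside a loader. The sketch below uses only the standard library. `read_model_type` is a hypothetical helper, and it reports only what the config declares; it does not verify the weights themselves.

```python
import json
from pathlib import Path


def read_model_type(config_path):
    """Return the `model_type` declared in a config.json, or None if absent."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return config.get("model_type")


if __name__ == "__main__":
    path = Path("lilt-only-base") / "config.json"
    model_type = read_model_type(path)
    if model_type == "liltrobertalike":
        print("Custom LiLT model type: combine with a text encoder before loading.")
    else:
        print(f"Declared model_type: {model_type!r}")
```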

	## License

	MIT, following the original [jpwang/lilt](https://github.com/jpwang/lilt) repository.

	## Acknowledgements

	- [LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding](https://arxiv.org/abs/2202.13669) (Wang et al., 2022)
	- Original weights: [jpwang/lilt](https://github.com/jpwang/lilt)

	> Note: This is not an official HuggingFace release from the original authors.