---
license: cc-by-nc-sa-4.0
pipeline_tag: image-text-to-text
library_name: transformers
---

# DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding

This model is presented in the paper [DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding](https://huggingface.co/papers/2408.15045). DocLayLLM is designed for text-rich document understanding, integrating visual patch tokens and 2D positional tokens into LLMs to enhance their document comprehension and OCR information perception.
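To make the idea of 2D positional information concrete, here is a minimal sketch of how OCR bounding boxes might be quantized into integer layout coordinates before being paired with text. It is an illustration only: the 0 to 1000 grid is a convention used by LayoutLM-style models and is assumed here, and the helper names `normalize_box` and `layout_tokens` are hypothetical, not part of DocLayLLM's code.

```python
from typing import List, Tuple

# A pixel-space box as (x0, y0, x1, y1); this layout is an assumption.
Box = Tuple[float, float, float, float]
GridBox = Tuple[int, int, int, int]


def normalize_box(box: Box, page_width: float, page_height: float,
                  grid: int = 1000) -> GridBox:
    """Scale a pixel-space box onto an integer grid of size grid x grid."""
    def scale(value: float, size: float) -> int:
        # Truncate to an integer and clamp into [0, grid].
        return max(0, min(grid, int(grid * value / size)))

    x0, y0, x1, y1 = box
    return (scale(x0, page_width), scale(y0, page_height),
            scale(x1, page_width), scale(y1, page_height))


def layout_tokens(words: List[Tuple[str, Box]], page_width: float,
                  page_height: float) -> List[Tuple[str, GridBox]]:
    """Pair each OCR word with its quantized 2D position."""
    return [(text, normalize_box(box, page_width, page_height))
            for text, box in words]


# Hypothetical OCR output for a 500 x 1000 pixel page.
ocr_words = [
    ("Invoice", (50, 40, 210, 80)),
    ("Total:", (50, 700, 130, 730)),
]
tokens = layout_tokens(ocr_words, page_width=500, page_height=1000)
print(tokens)
```

Quantizing to a fixed grid keeps positions comparable across pages of different resolutions; how DocLayLLM actually encodes positions is described in the paper.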

## How to Use

A more complete usage example will be added when available. Until then, here is a basic example:

```python
from transformers import pipeline

# "your_model_id" is a placeholder, not a real model ID.
pipe = pipeline("text-generation", model="your_model_id")
result = pipe("Your input text here.")
print(result)
```

Replace `"your_model_id"` with the actual Hugging Face model ID.
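For a single prompt, a `text-generation` pipeline returns a list of dictionaries, each with a `generated_text` key. The sketch below shows one way to pull out the first candidate; the helper name `first_generated_text` is hypothetical, and the mock result stands in for real pipeline output so the snippet runs without downloading a model.

```python
from typing import Any, Dict, List


def first_generated_text(result: List[Dict[str, Any]]) -> str:
    """Return the text of the first candidate from a text-generation result."""
    if not result:
        raise ValueError("pipeline returned no candidates")
    return result[0]["generated_text"]


# Mock data shaped like the output of pipe("Your input text here.").
mock_result = [{"generated_text": "Your input text here. A continuation."}]
text = first_generated_text(mock_result)
print(text)
```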