
	---
	license: mit
	datasets:
	- de-Rodrigo/merit
	language:
	- en
	- es
	base_model:
	- llava-hf/llava-1.5-7b-hf
	pipeline_tag: image-text-to-text
	---

	# LLaVA Merit

	<a href="https://x.com/nearcyan/status/1706914605262684394">
	<div style="text-align: center;">
	<picture>
	<source media="(prefers-color-scheme: dark)" srcset="https://huggingface.co/de-Rodrigo/donut-merit/resolve/main/assets/dragon_huggingface.png">
	<source media="(prefers-color-scheme: light)" srcset="https://huggingface.co/de-Rodrigo/donut-merit/resolve/main/assets/dragon_huggingface.png">
	<img alt="DragonHuggingFace" src="https://huggingface.co/de-Rodrigo/donut-merit/resolve/main/assets/dragon_huggingface.png" style="width: 200px;">
	</picture>
	</div>
	</a>


	## Model Architecture
	This model is based on the LLaVA architecture and fine-tuned on the Merit dataset for form understanding tasks.

	- Backbone: [LLaVA 1.5 7B](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
	- Training Data: [Merit](https://huggingface.co/datasets/de-Rodrigo/merit)

	## Example Usage

	```python
	# WIP
	```
	WIP 🛠️
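
In the meantime, here is a minimal sketch of how inference might look. It assumes the checkpoint loads through the same `transformers` classes as its base model (`LlavaForConditionalGeneration`, `AutoProcessor`) and that it keeps the `USER: <image>\n... ASSISTANT:` prompt format of `llava-hf/llava-1.5-7b-hf`. Neither assumption is confirmed by this card. The helper names (`build_prompt`, `extract_answer`, `run_inference`) and the `max_new_tokens` default are illustrative, and `device_map="auto"` additionally requires `accelerate`.

```python
MODEL_ID = "de-Rodrigo/llava-merit"  # assumption: loads like the base LLaVA 1.5 model


def build_prompt(question: str) -> str:
    """Wrap a question in the LLaVA 1.5 prompt format used by the base model."""
    return f"USER: <image>\n{question.strip()} ASSISTANT:"


def extract_answer(decoded: str) -> str:
    """Keep only the text that follows the last 'ASSISTANT:' marker."""
    return decoded.rsplit("ASSISTANT:", 1)[-1].strip()


def run_inference(image_path: str, question: str, max_new_tokens: int = 256) -> str:
    """Run one image + question through the model and return the answer text."""
    # Heavy imports are local so the helpers above work without them installed.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = LlavaForConditionalGeneration.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, text=build_prompt(question), return_tensors="pt")
    # Move tensors to the model device; only floating-point tensors are cast.
    inputs = inputs.to(model.device, torch.float16)

    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # The decoded string contains the prompt followed by the generated answer.
    decoded = processor.decode(output_ids[0], skip_special_tokens=True)
    return extract_answer(decoded)
```

A call such as `run_inference("sample.png", "Extract the subjects and grades.")` would then return the generated text. Both the file name and the question are placeholders, and the prompt wording the model was actually fine-tuned with may differ.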