Paper: OCR-free Document Understanding Transformer (arXiv:2111.15664)
This model is fine-tuned from Donut to extract information from Indonesian national ID cards (KTP, Kartu Tanda Penduduk).
The model can extract the following fields from KTP:
import torch
from PIL import Image
from donut import DonutModel
# Load model
model = DonutModel.from_pretrained("ahmadarif019/donut-ktp-extractor")
model.eval()
# Use GPU if available
if torch.cuda.is_available():
    model.half()
    model.to("cuda")
# Process image
image = Image.open("ktp.jpg").convert("RGB")
result = model.inference(image=image, prompt="<s_dataset_ktp>")
ktp_data = result["predictions"][0]
print(ktp_data)
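
The snippet above indexes `result["predictions"][0]` directly, which raises an error if the model returns no predictions. The following is a minimal sketch of a more defensive way to read the result and serialize it, assuming the output has the `{"predictions": [...]}` shape used above. The helper names `first_prediction` and `to_pretty_json` and the field names in the sample dictionary are hypothetical and not part of the Donut API or of this model's documented output.

```python
import json
from typing import Any, Dict


def first_prediction(result: Dict[str, Any]) -> Dict[str, Any]:
    """Return the first prediction, or an empty dict if there is none."""
    predictions = result.get("predictions") or []
    if not predictions:
        return {}
    first = predictions[0]
    # Wrap non-dict outputs so callers always receive a dict.
    return first if isinstance(first, dict) else {"raw": first}


def to_pretty_json(data: Dict[str, Any]) -> str:
    """Serialize extracted fields, keeping non-ASCII characters readable."""
    return json.dumps(data, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder result with hypothetical field names, for illustration only.
    sample_result = {"predictions": [{"field_a": "value A", "field_b": "value B"}]}
    print(to_pretty_json(first_prediction(sample_result)))
```

In practice, `first_prediction(result)` would replace the direct indexing in the usage example, and the JSON string can be written to a file or passed to downstream validation.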
@article{kim2021donut,
title={OCR-free Document Understanding Transformer},
author={Kim, Geewook and Hong, Teakgyu and Yim, Moonbin and Nam, JeongYeon and Park, Jinyoung and Yim, Jinyeong and Hwang, Wonseok and Yun, Sangdoo and Han, Dongyoon and Park, Seunghyun},
journal={arXiv preprint arXiv:2111.15664},
year={2021}
}
MIT License - For research and internal use only.