Driver License Reader
Donut-based image-to-JSON extraction for driver's license documents.
driver license OCR alternative | ID card parsing | KYC document extraction | visual document understanding
What it does
This model is a fine-tuned Donut VisionEncoderDecoderModel for extracting structured fields from driver's license images without a separate OCR pipeline. It converts an input image into JSON-like fields such as:
- name
- state
- date
- dob
- person
The model is intended for demos, prototyping, and research around document AI workflows. It is not a production identity-verification system.
Public-safety notes
- Do not upload real driver's licenses unless you have permission and a lawful basis to process the data.
- Prefer synthetic, redacted, or consented images for testing.
- Outputs can be wrong or incomplete. Use human review before relying on extracted identity fields.
- The repository keeps the safer model.safetensors weight file and does not require loading Pickle weights.
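As one way to support the human-review step above, the following minimal sketch flags fields that are missing or blank in a parsed result so a reviewer can look at them first. The field names are assumed from the field list in this card, and `fields_needing_review` and the sample dictionary are hypothetical, not part of the repository.

```python
# Field names assumed from the model card's field list; adjust to the keys
# your checkpoint actually emits.
EXPECTED_FIELDS = ("name", "state", "date", "dob", "person")


def fields_needing_review(parsed, expected=EXPECTED_FIELDS):
    """Return the expected fields that are missing or blank in `parsed`."""
    flagged = []
    for field in expected:
        value = parsed.get(field)
        # Treat absent keys, non-string values, and whitespace-only strings as gaps.
        if not isinstance(value, str) or not value.strip():
            flagged.append(field)
    return flagged


# Hypothetical, synthetic output used only for illustration.
sample = {"name": "JANE SAMPLE", "state": "", "dob": "01/01/1990"}
print(fields_needing_review(sample))  # ['state', 'date', 'person']
```

An empty result only means every expected field is present; it says nothing about whether the extracted values are correct, so it does not replace human review.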
Quick start
import re

import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

model_id = "lucky-verma/driver-license-reader"
processor = DonutProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

# Load a redacted or synthetic image and convert it to model inputs.
image = Image.open("redacted_or_synthetic_license.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

# The task prompt token starts the decoder sequence.
task_prompt = "<s_cord-v2>"
decoder_input_ids = processor.tokenizer(
    task_prompt,
    add_special_tokens=False,
    return_tensors="pt",
)["input_ids"].to(device)

with torch.inference_mode():
    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=model.decoder.config.max_position_embeddings,
        early_stopping=True,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
        use_cache=True,
        num_beams=1,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
        return_dict_in_generate=True,
    )

# Strip EOS/pad tokens and the leading task token, then convert the tag sequence to JSON.
sequence = processor.batch_decode(outputs.sequences)[0]
sequence = sequence.replace(processor.tokenizer.eos_token, "")
sequence = sequence.replace(processor.tokenizer.pad_token, "")
sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()
print(processor.token2json(sequence))
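The cleanup at the end of the quick start can be tried in isolation, without downloading the model. The sketch below assumes the tokenizer's EOS and pad tokens are `</s>` and `<pad>` (check `processor.tokenizer` for the actual values); `strip_special_tokens` and the decoded string are hypothetical.

```python
import re


def strip_special_tokens(sequence, eos_token="</s>", pad_token="<pad>"):
    """Mirror the quick-start cleanup: drop EOS/pad tokens, then the leading task tag."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # count=1 removes only the first tag (the task prompt); the remaining
    # field tags are kept so they can be parsed into JSON afterwards.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


# Hypothetical decoded string; real output depends on the checkpoint and image.
decoded = "<s_cord-v2><s_name>JANE SAMPLE</s_name></s><pad><pad>"
print(strip_special_tokens(decoded))  # <s_name>JANE SAMPLE</s_name>
```

Limiting the substitution to the first match matters: without `count=1`, the field tags would be stripped as well and `token2json` would have nothing to parse.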
Model details
- Architecture: Donut / vision encoder-decoder
- Base model: nielsr/donut-base
- Format: model.safetensors
- Input: image of a driver's license-like document
- Output: structured JSON-style fields
- Language: English
Limitations
This model was trained on a small driver's-license dataset and may fail on unseen layouts, glare, blur, occlusion, low-resolution scans, non-English documents, or non-US license formats. Treat the output as an extraction suggestion, not a verified identity record.