TinyDoc-VLM-256M
256M-parameter document-specialist Vision-Language Model
Overview
TinyDoc-VLM is a compact vision-language model specialized for document understanding tasks: OCR, form extraction, table parsing, receipt processing, and visual question answering.
- 256M params: SigLIP-B/16 vision encoder (93M) + PixelShuffle 3× compressor + SmolLM2-135M decoder
- <1GB VRAM: Runs on MacBook, Raspberry Pi 5, or any CPU with ONNX
- Apache 2.0: Fully open-source, free for commercial use
Architecture
Image (384×384) → SigLIP-B/16 (93M) → PixelShuffle 3× → 64 tokens → SmolLM2-135M → JSON/KV/Table/OCR/QA
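The token count in the pipeline above can be checked by hand: a 384×384 image cut into 16×16 patches gives a 24×24 grid (576 patches), and merging each 3×3 neighbourhood leaves an 8×8 grid (64 tokens). The sketch below reproduces that arithmetic and shows one common way to do the merge (space-to-depth, i.e. concatenating the features of each 3×3 block). It assumes the compressor works this way; the function names are hypothetical and are not part of the tinydoc_vlm package.

```python
def visual_token_count(image_size=384, patch_size=16, shuffle_factor=3):
    """Return (patches_before, tokens_after) for a square image.

    Assumes non-overlapping square patches and a compressor that merges
    each shuffle_factor x shuffle_factor block of patches into one token.
    """
    patches_per_side = image_size // patch_size
    if patches_per_side % shuffle_factor != 0:
        raise ValueError("patch grid is not divisible by the shuffle factor")
    tokens_per_side = patches_per_side // shuffle_factor
    return patches_per_side ** 2, tokens_per_side ** 2


def space_to_depth(grid, r):
    """Merge each r x r block of feature vectors into one longer vector.

    grid is an H x W nested list whose cells are lists of C floats.
    The result is an (H/r) x (W/r) grid whose cells hold r*r*C floats.
    """
    height, width = len(grid), len(grid[0])
    merged_grid = []
    for i in range(0, height, r):
        row = []
        for j in range(0, width, r):
            merged = []
            for di in range(r):
                for dj in range(r):
                    merged.extend(grid[i + di][j + dj])
            row.append(merged)
        merged_grid.append(row)
    return merged_grid


if __name__ == "__main__":
    print(visual_token_count())  # (576, 64)
```

Note that the merge trades sequence length for channel width: each output token carries nine times as many features as an input patch, so a projection layer is normally needed before the decoder.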
Usage
from tinydoc_vlm import TinyDocVLMForConditionalGeneration, TinyDocVLMProcessor
model = TinyDocVLMForConditionalGeneration.from_pretrained("eulogik/TinyDoc-VLM-256M")
processor = TinyDocVLMProcessor()
LoRA Fine-tuning
Train on your own documents with LoRA (2.7M trainable params, 0.93% of total):
# Generate synthetic docs
python data/synthetic/generator.py --num-docs 1000 --output-dir data/synthetic/output
# Train with LoRA
python training/fast_train.py --steps 5000 --device mps # M4 Mac
python training/fast_train.py --steps 5000 --device cuda # GPU
See training/colab_train.ipynb for a complete Colab notebook.
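For intuition about why the trainable-parameter count stays small: LoRA freezes each adapted weight matrix of shape d_out × d_in and trains two low-rank factors, which adds rank × (d_in + d_out) parameters per layer. The sketch below only illustrates that formula. The layer shapes and rank are hypothetical placeholders, not the configuration used by training/fast_train.py.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def total_lora_params(layer_shapes, rank):
    """Sum the adapter parameters over a list of (d_in, d_out) layer shapes."""
    return sum(lora_param_count(d_in, d_out, rank) for d_in, d_out in layer_shapes)


if __name__ == "__main__":
    # Hypothetical shapes for two adapted projections in a single block.
    shapes = [(576, 576), (576, 192)]
    print(total_lora_params(shapes, rank=8))  # 15360
```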
ONNX Export
python export/export_onnx.py --model-path eulogik/TinyDoc-VLM-256M --output model.onnx
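After exporting, it can help to inspect the graph's inputs and outputs before writing inference code, since the tensor names and shapes depend on how the export script traced the model. The following is a minimal sketch using ONNX Runtime's Python API. It assumes the export produced a single model.onnx file and that the onnxruntime package is installed; the helper name is hypothetical.

```python
def describe_onnx_io(model_path):
    """Return (inputs, outputs) as lists of (name, shape, type) tuples."""
    # Imported lazily so this module loads even where onnxruntime is absent.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    inputs = [(i.name, i.shape, i.type) for i in session.get_inputs()]
    outputs = [(o.name, o.shape, o.type) for o in session.get_outputs()]
    return inputs, outputs


# Example usage (requires the exported file to exist):
#   inputs, outputs = describe_onnx_io("model.onnx")
#   for name, shape, dtype in inputs:
#       print(name, shape, dtype)
```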
Benchmarks
| Benchmark | Status |
|---|---|
| OCRBench | In progress (needs instruction tuning) |
| DocVQA | Pending |
| FUNSD | Pending |
Citation
@software{eulogik_tinydoc_vlm_2026,
author = {eulogik},
title = {TinyDoc-VLM: 256M-Param Document-Specialist VLM},
year = {2026},
url = {https://github.com/eulogik/TinyDoc-VLM}
}
License
Apache 2.0. See LICENSE.