TinyDoc-VLM LoRA Checkpoint

Fine-tuned document AI. 2.7M trainable params. 15 hours on an Apple M4 Mac. Loss: 43.3 → 15.0 at the best checkpoint (17.2 at the final step).

What is this?

A LoRA adapter for TinyDoc-VLM-256M, fine-tuned on document understanding tasks. Only 2.7M parameters (0.93% of the total) are trained, which keeps both training and deployment cheap.

Quick Start

from tinydoc_vlm import TinyDocVLMForConditionalGeneration, TinyDocVLMProcessor
from peft import PeftModel

# Load base model
model = TinyDocVLMForConditionalGeneration.from_pretrained("eulogik/TinyDoc-VLM-256M")

# Apply LoRA adapter
model = PeftModel.from_pretrained(model, "eulogik/TinyDoc-VLM-LoRA")

# Merge the adapter into the base weights (optional; avoids adapter overhead at inference)
model = model.merge_and_unload()

processor = TinyDocVLMProcessor()

Training Details

Parameter	Value
Base model	eulogik/TinyDoc-VLM-256M
LoRA rank	16
LoRA alpha	32
Trainable params	2,727,936 (0.93% of total)
Target modules	q_proj, v_proj, k_proj, o_proj
Training data	3,000 synthetic documents (6,815 QA pairs)
Training steps	17,000
Best step	14,000 (loss: 15.0)
Final loss	17.2 (from 43.3)
Hardware	Apple M4 Mac
Training time	15.1 hours
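
As a rough sanity check on the table, the arithmetic behind a LoRA adapter's size and update scaling can be sketched in a few lines. The alpha / rank scaling is the standard LoRA convention. The 576 x 576 layer shape is purely hypothetical, since this card does not list the base model's projection dimensions, and both helper functions are illustrative names rather than part of the TinyDoc-VLM code.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / rank


RANK, ALPHA = 16, 32  # values from the table above

scaling = lora_scaling(ALPHA, RANK)

# Hypothetical square projection; the real shapes are not given in this card.
example_params = lora_param_count(d_in=576, d_out=576, rank=RANK)

print(f"update scaling: {scaling}")
print(f"params for one hypothetical 576x576 projection: {example_params:,}")
```

With rank 16 and alpha 32, the low-rank update is scaled by 2 before being added to the frozen weight. Summing `lora_param_count` over every targeted projection in the real model would be the way to cross-check the reported 2,727,936 trainable parameters.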

Training Curve

Step 25:    43.3  ████████████████████████████████████████████
Step 500:   25.7  ███████████████████████████
Step 1000:  20.9  █████████████████████
Step 5000:  18.6  ██████████████████
Step 10000: 16.5  ████████████████
Step 14000: 15.0  ███████████████ ★ Best
Step 17000: 17.2  █████████████████
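
The curve bottoms out at step 14,000 and rises again by step 17,000, so the last step is not the best one. A minimal sketch of picking the best logged checkpoint from these numbers follows. The values are copied from the curve above, and the card does not say whether the published adapter corresponds to the best or the final step.

```python
# (step, loss) pairs copied from the training curve above
curve = [
    (25, 43.3), (500, 25.7), (1000, 20.9), (5000, 18.6),
    (10000, 16.5), (14000, 15.0), (17000, 17.2),
]

# Lowest logged loss wins; ties would resolve to the earliest step.
best_step, best_loss = min(curve, key=lambda point: point[1])
final_step, final_loss = curve[-1]

# How far the final step drifted above the best logged loss.
regression = round(final_loss - best_loss, 1)

print(f"best:  step {best_step} (loss {best_loss})")
print(f"final: step {final_step} (loss {final_loss}), {regression} above best")
```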

What This Trains

Document OCR — Printed text recognition
Form field extraction — Key-value pair extraction
Receipt/invoice parsing — Amount, date, vendor extraction
Table structure understanding — Cell extraction
Visual question answering — Answer questions about documents

How to Reproduce

# Clone repo
git clone https://github.com/eulogik/TinyDoc-VLM.git
cd TinyDoc-VLM
pip install -e .

# Generate synthetic docs
python data/synthetic/generator.py --num-docs 3000 --output-dir data/synthetic/output

# Train LoRA (17K steps, ~15 hours on M4)
python training/fast_train.py \
    --manifest data/synthetic/output/manifest.jsonl \
    --data-root data/synthetic \
    --steps 17000 --batch-size 1 --grad-accum 4 --device mps

# Or use overnight script
bash training/overnight_train.sh
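
To put the training flags in context, here is a back-of-the-envelope calculation of the effective batch size, time per step, and approximate passes over the data. It assumes that `--steps` counts optimizer steps and that each training sample is one QA pair. Neither is confirmed by this card, so treat the epoch figure as an estimate.

```python
BATCH_SIZE = 1      # --batch-size
GRAD_ACCUM = 4      # --grad-accum
STEPS = 17_000      # --steps
TRAIN_HOURS = 15.1  # reported training time
QA_PAIRS = 6_815    # reported dataset size

# Gradients are accumulated over GRAD_ACCUM micro-batches before each update.
effective_batch = BATCH_SIZE * GRAD_ACCUM
seconds_per_step = TRAIN_HOURS * 3600 / STEPS

# Assumption: --steps counts optimizer steps and one sample is one QA pair.
samples_seen = STEPS * effective_batch
approx_epochs = samples_seen / QA_PAIRS

print(f"effective batch size: {effective_batch}")
print(f"seconds per step:     {seconds_per_step:.1f}")
print(f"approximate epochs:   {approx_epochs:.1f}")
```

If `--steps` instead counts micro-batches, the model sees 17,000 samples rather than 68,000, which is closer to 2.5 passes over the QA pairs.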

Next Steps

Scale to 10K+ documents
Add public benchmarks (DocVQA, FUNSD, CORD)
Train for 50K+ steps
Export to ONNX

Links

Resource	URL
Base Model	eulogik/TinyDoc-VLM-256M
GitHub	github.com/eulogik/TinyDoc-VLM
Live Demo	huggingface.co/spaces/eulogik/TinyDoc-VLM
Training Script	training/fast_train.py

License

Apache 2.0. Same as base model.

Part of the TinyDoc-VLM project.
