TinyDoc-VLM LoRA Checkpoint

Fine-tuned document AI. 2.7M trainable params. 15 hours on a Mac. Loss: 43 → 15.

What is this?

A LoRA adapter for TinyDoc-VLM-256M, fine-tuned on document understanding tasks. Only 2.7M params (0.93% of total) are trained, which keeps training cheap and the adapter small to distribute.

Quick Start

from tinydoc_vlm import TinyDocVLMForConditionalGeneration, TinyDocVLMProcessor
from peft import PeftModel

# Load base model
model = TinyDocVLMForConditionalGeneration.from_pretrained("eulogik/TinyDoc-VLM-256M")

# Apply LoRA adapter
model = PeftModel.from_pretrained(model, "eulogik/TinyDoc-VLM-LoRA")

# Merge for inference (optional, but faster)
model = model.merge_and_unload()

processor = TinyDocVLMProcessor()

Training Details

| Parameter        | Value                                      |
|------------------|--------------------------------------------|
| Base model       | eulogik/TinyDoc-VLM-256M                   |
| LoRA rank        | 16                                         |
| LoRA alpha       | 32                                         |
| Trainable params | 2,727,936 (0.93% of total)                 |
| Target modules   | q_proj, v_proj, k_proj, o_proj             |
| Training data    | 3,000 synthetic documents (6,815 QA pairs) |
| Training steps   | 17,000                                     |
| Best step        | 14,000 (loss: 15.0)                        |
| Final loss       | 17.2 (from 43.3)                           |
| Hardware         | Apple M4 Mac                               |
| Training time    | 15.1 hours                                 |
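
The trainable-parameter count follows from the rank and the shapes of the adapted projections: LoRA adds two matrices per adapted linear layer, A (rank x d_in) and B (d_out x rank), so each layer contributes rank * (d_in + d_out) parameters, and the low-rank update is scaled by alpha / rank. Below is a minimal sketch of that arithmetic. The hidden size and block count are hypothetical placeholders, not the real TinyDoc-VLM-256M dimensions, so the printed total is not expected to match 2,727,936.

```python
def lora_param_count(shapes, rank):
    """Trainable parameters LoRA adds for a list of (d_in, d_out) linear layers."""
    # Each adapted layer gains A (rank x d_in) and B (d_out x rank).
    return sum(rank * (d_in + d_out) for d_in, d_out in shapes)


def lora_scaling(alpha, rank):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


# Rank and alpha as listed in Training Details.
RANK, ALPHA = 16, 32

# HYPOTHETICAL shapes: four square projections (q, k, v, o) per attention block.
HIDDEN = 512      # hypothetical hidden size
NUM_BLOCKS = 8    # hypothetical number of attention blocks
per_block = [(HIDDEN, HIDDEN)] * 4

total = lora_param_count(per_block * NUM_BLOCKS, RANK)
print(f"scaling = {lora_scaling(ALPHA, RANK)}")
print(f"trainable LoRA params (hypothetical shapes) = {total:,}")
```

Substituting the real projection shapes from the base model's config should reproduce the reported count, provided only the four listed target modules are adapted.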

Training Curve

Step 25:    43.3  ████████████████████████████████████████████
Step 500:   25.7  ███████████████████████████
Step 1000:  20.9  █████████████████████
Step 5000:  18.6  ██████████████████
Step 10000: 16.5  ████████████████
Step 14000: 15.0  ███████████████ ★ Best
Step 17000: 17.2  █████████████████
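
The loss rises again after step 14,000, so the last checkpoint is not the best one. Here is a small sketch of selecting a checkpoint by lowest logged loss, using only the values listed above. The helper name `best_checkpoint` is illustrative and not part of the repo, and it is not stated here whether these are training or validation losses.

```python
# Loss values copied from the training curve (step -> loss).
curve = {
    25: 43.3,
    500: 25.7,
    1000: 20.9,
    5000: 18.6,
    10000: 16.5,
    14000: 15.0,
    17000: 17.2,
}


def best_checkpoint(losses):
    """Return (step, loss) for the lowest logged loss."""
    step = min(losses, key=losses.get)
    return step, losses[step]


best_step, best_loss = best_checkpoint(curve)
first_loss = curve[min(curve)]   # loss at the earliest logged step
final_loss = curve[max(curve)]   # loss at the last logged step

# Relative drop from the first logged loss to the best one.
reduction = (first_loss - best_loss) / first_loss
print(f"best step: {best_step} (loss {best_loss})")
print(f"reduction from first logged loss: {reduction:.1%}")
print(f"final vs. best: +{final_loss - best_loss:.1f}")
```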

What This Trains

  • Document OCR: Printed text recognition
  • Form field extraction: Key-value pair extraction
  • Receipt/invoice parsing: Amount, date, vendor extraction
  • Table structure understanding: Cell extraction
  • Visual question answering: Answer questions about documents

How to Reproduce

# Clone repo
git clone https://github.com/eulogik/TinyDoc-VLM.git
cd TinyDoc-VLM
pip install -e .

# Generate synthetic docs
python data/synthetic/generator.py --num-docs 3000 --output-dir data/synthetic/output

# Train LoRA (17K steps, ~15 hours on M4)
python training/fast_train.py \
    --manifest data/synthetic/output/manifest.jsonl \
    --data-root data/synthetic \
    --steps 17000 --batch-size 1 --grad-accum 4 --device mps

# Or use overnight script
bash training/overnight_train.sh
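
To sanity-check a reproduction run against the reported numbers, the command-line settings can be turned into an effective batch size, a rough time per step, and an approximate epoch count. The sketch below uses only figures quoted on this card. It assumes one training sample per QA pair, and because it is not stated whether `--steps` counts optimizer updates or micro-batches, it computes both readings.

```python
# Settings from the training command and the Training Details section.
steps = 17_000
batch_size = 1
grad_accum = 4
qa_pairs = 6_815      # ASSUMPTION: one training sample per QA pair
hours = 15.1

# Samples contributing to each optimizer update.
effective_batch = batch_size * grad_accum

# Average wall-clock time per step over the whole run.
seconds_per_step = hours * 3600 / steps

# ASSUMPTION: two possible meanings of --steps, neither confirmed by the card.
epochs_if_optimizer_steps = steps * effective_batch / qa_pairs
epochs_if_micro_batches = steps * batch_size / qa_pairs

print(f"effective batch size: {effective_batch}")
print(f"seconds per step: {seconds_per_step:.2f}")
print(f"epochs (steps = optimizer updates): {epochs_if_optimizer_steps:.1f}")
print(f"epochs (steps = micro-batches): {epochs_if_micro_batches:.1f}")
```

If a run on similar hardware deviates far from the per-step time computed here, the device flag or data loading is a reasonable first thing to check.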

Next Steps

  • Scale to 10K+ documents
  • Add public benchmarks (DocVQA, FUNSD, CORD)
  • Train for 50K+ steps
  • Export to ONNX

License

Apache 2.0, the same license as the base model.


Part of the TinyDoc-VLM project.
