---
language:
- es
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen3-VL-4B-Instruct
tags:
- invoice-extraction
- ocr
- spanish
- lora
- vision
- finance
pipeline_tag: image-to-text
---

# diffu-0.2: Spanish Invoice Data Extractor (Vision)

**diffu-0.2** is a fine-tuned vision-language model for structured data extraction from Spanish invoice images. Built by [V10 Labs](https://v10labs.com), it extracts supplier details, tax IDs, amounts, and dates from invoice photographs and scans.

## Performance

| Model | Accuracy | Type |
|-------|----------|------|
| **diffu-0.2 (this model)** | **93.39%** | Fine-tuned, vision |
| diffu-0.1 (V10 Labs) | 92.82% | Fine-tuned, text-only |
| Claude Sonnet 4.6 | 61.6% | Generalist, zero-shot |
| Qwen3-VL-4B (base) | 54.4% | Generalist, zero-shot |

### Per-Field Accuracy

| Field | Accuracy |
|-------|----------|
| supplier | 92.06% |
| supplier_cif | 94.12% |
| invoice_number | 91.35% |
| date | 95.33% |
| subtotal | 92.06% |
| tax_total | 89.25% |
| total | 92.99% |
| doc_type | 100.00% |

## Model Details

- **Base model**: Qwen/Qwen3-VL-4B-Instruct
- **Method**: LoRA (r=64, alpha=128)
- **Target modules**: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- **Training**: 2 epochs, LR=1e-4, effective batch size 16
- **Image resolution**: (256 to 1280) × 28 × 28 pixels
- **Adapter size**: 504 MB
- **Peak VRAM**: 22.57 GB (training), ~10 GB (inference)
- **Parse failures**: 0%

## Output Format

## Usage

## About V10 Labs

V10 Labs builds AI-powered financial intelligence for SMBs in Spain. We train purpose-built models that outperform general-purpose LLMs on domain-specific tasks like invoice processing, accounting classification, and financial analysis.

[v10labs.com](https://v10labs.com)
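
## Appendix: Image Pixel Budget (Sketch)

The image resolution listed under Model Details can be read as a per-image pixel budget: between 256 and 1280 units of 28 × 28 pixels. The sketch below works out those bounds and shows one way to pre-scale a scan so its area falls inside them. The helper name `fit_to_budget` and the uniform-scaling strategy are assumptions made for illustration; this is not the model processor's actual resizing logic, which may round dimensions differently.

```python
import math

# Bounds taken from the "Image resolution" entry under Model Details.
PATCH_AREA = 28 * 28             # 784 pixels per unit
MIN_PIXELS = 256 * PATCH_AREA    # 200,704 pixels
MAX_PIXELS = 1280 * PATCH_AREA   # 1,003,520 pixels


def fit_to_budget(width, height, min_pixels=MIN_PIXELS, max_pixels=MAX_PIXELS):
    """Hypothetical helper: scale (width, height) uniformly so that the
    image area lands inside [min_pixels, max_pixels].

    Aspect ratio is preserved up to integer rounding.
    """
    area = width * height
    if area > max_pixels:
        scale = math.sqrt(max_pixels / area)
        # Round down so the scaled area never exceeds the upper bound.
        return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
    if area < min_pixels:
        scale = math.sqrt(min_pixels / area)
        # Round up so the scaled area never drops below the lower bound.
        return math.ceil(width * scale), math.ceil(height * scale)
    return width, height


if __name__ == "__main__":
    # An A4 page scanned at 300 DPI is roughly 2480 x 3508 pixels.
    print(fit_to_budget(2480, 3508))
```

A full-page 300 DPI scan is far above the upper bound, so it would be downscaled substantially before reaching the model. Small print such as tax IDs may therefore benefit from cropping to the invoice area before resizing.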