---
license: apache-2.0
library_name: mlx
pipeline_tag: image-text-to-text
base_model: tiiuae/Falcon-OCR
tags:
- mlx
- ocr
- vision-language
- falcon
- apple-silicon
language:
- en
---

# Falcon-OCR (MLX, bf16)

MLX-converted weights of [`tiiuae/Falcon-OCR`](https://huggingface.co/tiiuae/Falcon-OCR) for inference on Apple Silicon via [`mlx-vlm`](https://github.com/Blaizzy/mlx-vlm).

## Source

- Base model: [tiiuae/Falcon-OCR](https://huggingface.co/tiiuae/Falcon-OCR), by the Falcon Perception Team, Technology Innovation Institute (TII).
- License: Apache 2.0 (matches the upstream base model).
- Architecture: early-fusion vision-language model, 300M parameters.

## Conversion details

- Tool: `mlx_vlm.convert` (mlx-vlm 0.4.4).
- Dtype: `bfloat16`.
- Source revision: `3a4d95a8b0008f7430df30a82cf35e6c3b6bcb66`.
- `trust_remote_code=True`: the repository ships a custom `FalconOCRProcessor` / `FalconOCRForCausalLM` that is loaded via dynamic module import.

## Known caveat

`mlx_vlm.convert` raises `AttributeError: 'FalconOCRProcessor' object has no attribute 'save_pretrained'` at the very end of the conversion step. The weights and tokenizer are written successfully before the error is raised, so the artifacts uploaded here are complete and `mlx_vlm.load(...)` / docling's `MlxVlmEngine` can consume them. Tracked upstream: .

## Usage

```python
from mlx_vlm import load, generate

# trust_remote_code=True is needed because the repository ships a custom processor and model class.
model, processor = load("mlx-community/Falcon-OCR-bf16", trust_remote_code=True)

# Run OCR on a single page image; replace the path with your own file.
output = generate(model, processor, prompt="", image=["path/to/page.png"])
print(output)
```

## Attribution

All credit for the underlying model goes to the Falcon Perception Team at TII. Cite the [model card](https://huggingface.co/tiiuae/Falcon-OCR) for academic references.
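
## Checking a local conversion

Because the conversion described under "Known caveat" exits with an error even though the weights and tokenizer are already on disk, it can help to confirm that a locally converted directory is complete before loading it. The sketch below is one possible check, not part of `mlx-vlm`. The helper name `missing_artifacts` is hypothetical, and the file names in `REQUIRED_FILES` are assumptions about what a conversion typically writes, so adjust them to match your own output directory.

```python
from pathlib import Path

# Assumed file names (not taken from the model card); adjust to what your
# conversion actually wrote.
REQUIRED_FILES = ("config.json", "tokenizer.json")


def missing_artifacts(model_dir):
    """Return a sorted list of expected artifacts that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Weights may be split into several shards, so accept any *.safetensors file.
    if not any(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    return sorted(missing)
```

An empty list suggests the directory holds the files this sketch looks for. It does not validate their contents, so loading the model remains the real test.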