---
license: apache-2.0
library_name: mlx
pipeline_tag: image-text-to-text
base_model: tiiuae/Falcon-OCR
tags:
- mlx
- ocr
- vision-language
- falcon
- apple-silicon
language:
- en
---

# Falcon-OCR (MLX, bf16)

MLX-converted weights of
[`tiiuae/Falcon-OCR`](https://huggingface.co/tiiuae/Falcon-OCR) for inference
on Apple Silicon via [`mlx-vlm`](https://github.com/Blaizzy/mlx-vlm).

## Source

- Base model: [tiiuae/Falcon-OCR](https://huggingface.co/tiiuae/Falcon-OCR),
  by the Falcon Perception Team, Technology Innovation Institute (TII).
- License: Apache 2.0 (matches the upstream base model).
- Architecture: early-fusion vision-language model, 300M parameters.

## Conversion details

- Tool: `mlx_vlm.convert` (mlx-vlm 0.4.4).
- Dtype: `bfloat16`.
- Source revision: `3a4d95a8b0008f7430df30a82cf35e6c3b6bcb66`.
- `trust_remote_code=True` was set, because the repository ships custom
  `FalconOCRProcessor` / `FalconOCRForCausalLM` classes that are loaded via
  dynamic module import.
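
The settings above can be sketched as a command line. This is a minimal illustration, not the exact invocation used for this repository: the flag names `--hf-path`, `--mlx-path`, and `--dtype` are assumed to match the installed mlx-vlm release (check `python -m mlx_vlm.convert --help`), the output directory name is hypothetical, and how the source revision and remote-code setting were passed is not shown because the card does not state it. The helper only builds the argument list and does not run anything.

```python
import sys


def build_convert_command(hf_path, mlx_path, dtype="bfloat16"):
    """Return the argv list for an mlx_vlm.convert run (nothing is executed)."""
    return [
        sys.executable, "-m", "mlx_vlm.convert",
        "--hf-path", hf_path,    # source repository on the Hub
        "--mlx-path", mlx_path,  # local output directory (hypothetical name below)
        "--dtype", dtype,        # assumed flag name; verify with --help
    ]


command = build_convert_command("tiiuae/Falcon-OCR", "Falcon-OCR-bf16")
print(" ".join(command[1:]))
```

Passing the resulting list to `subprocess.run` would start the conversion; expect the `AttributeError` described under "Known caveat" at the end of the run.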

## Known caveat

`mlx_vlm.convert` raises
`AttributeError: 'FalconOCRProcessor' object has no attribute 'save_pretrained'`
at the very end of the conversion step. The weights and tokenizer are written
successfully before the error, so the artifacts uploaded here are complete,
and both `mlx_vlm.load(...)` and docling's `MlxVlmEngine` can consume them.
Tracked upstream: <https://github.com/Blaizzy/mlx-vlm/issues>.
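
If you reproduce the conversion and hit the same error, a quick check of the output directory can confirm that the weights and tokenizer were written before the failure. The sketch below uses only the standard library. The file names it looks for (`config.json`, `*.safetensors`, `tokenizer.json` or `tokenizer.model`) are assumptions about a typical converted directory, not a list taken from this repository.

```python
from pathlib import Path

# Assumed file names; adjust to what the converted directory actually contains.
REQUIRED_FILES = ("config.json",)
TOKENIZER_FILES = ("tokenizer.json", "tokenizer.model")


def missing_artifacts(model_dir):
    """Return a list of missing items; an empty list means the check passed."""
    root = Path(model_dir)
    problems = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # At least one safetensors shard must exist for the weights.
    if not any(root.glob("*.safetensors")):
        problems.append("*.safetensors (weights)")
    # Any one of the known tokenizer files is enough.
    if not any((root / name).is_file() for name in TOKENIZER_FILES):
        problems.append("tokenizer file")
    return problems
```

Calling `missing_artifacts("Falcon-OCR-bf16")` on a hypothetical local output directory returns an empty list when all three kinds of file are present.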

## Usage

```python
from mlx_vlm import load, generate

# trust_remote_code=True allows the custom FalconOCRProcessor / FalconOCRForCausalLM code to load.
model, processor = load("mlx-community/Falcon-OCR-bf16", trust_remote_code=True)

# Run OCR on a single page image; the prompt is left empty.
output = generate(model, processor, prompt="", image=["path/to/page.png"])
print(output)
```
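
Depending on the installed mlx-vlm version, `generate` may return a plain string or a result object that carries the text alongside generation statistics. The helper below is a small sketch that handles both cases. The `.text` attribute name is an assumption about the result object, and `result_text` is a hypothetical helper, not part of mlx-vlm.

```python
def result_text(output):
    """Return the generated text whether `output` is a str or an object with `.text`."""
    if isinstance(output, str):
        return output
    # Assumed attribute name on the result object; fall back to str() otherwise.
    text = getattr(output, "text", None)
    return text if isinstance(text, str) else str(output)
```

With this in place, `print(result_text(output))` prints only the recognized text in either case.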

## Attribution

All credit for the underlying model goes to the Falcon Perception Team at TII.
For academic references, cite the upstream
[model card](https://huggingface.co/tiiuae/Falcon-OCR).