---
license: apache-2.0
library_name: mlx
pipeline_tag: image-text-to-text
base_model: tiiuae/Falcon-OCR
tags:
- mlx
- ocr
- vision-language
- falcon
- apple-silicon
language:
- en
---
# Falcon-OCR (MLX, bf16)
MLX-converted weights of
[`tiiuae/Falcon-OCR`](https://huggingface.co/tiiuae/Falcon-OCR) for inference
on Apple Silicon via [`mlx-vlm`](https://github.com/Blaizzy/mlx-vlm).
## Source
- Base model: [tiiuae/Falcon-OCR](https://huggingface.co/tiiuae/Falcon-OCR),
  by the Falcon Perception Team, Technology Innovation Institute (TII).
- License: Apache 2.0 (matches the upstream base model).
- Architecture: early-fusion vision-language model, 300M parameters.
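As a rough sanity check on download size and memory use, the weight footprint can be estimated from the parameter count. The sketch below assumes exactly 300M parameters stored at 2 bytes each (bf16) and ignores the tokenizer, config files, and runtime overhead; `bf16_weight_bytes` is a hypothetical helper, not part of any library.

```python
def bf16_weight_bytes(num_params: int) -> int:
    """Approximate weight size in bytes, assuming 2 bytes per bf16 parameter."""
    return num_params * 2


# 300M is the rounded figure from the list above, so treat the result as an estimate.
approx_bytes = bf16_weight_bytes(300_000_000)
print(f"~{approx_bytes / 1e6:.0f} MB of weights")
```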
## Conversion details
- Tool: `mlx_vlm.convert` (mlx-vlm 0.4.4).
- Dtype: `bfloat16`.
- Source revision: `3a4d95a8b0008f7430df30a82cf35e6c3b6bcb66`.
- `trust_remote_code=True`: the repository ships custom
  `FalconOCRProcessor` / `FalconOCRForCausalLM` classes that are loaded via
  dynamic module import.
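The settings above could be reproduced with an invocation along the following lines. This is a sketch, not the exact command used for this upload: the `--hf-path`, `--mlx-path`, and `--dtype` flags are assumed from the `mlx_vlm.convert` command-line interface and should be checked against `python -m mlx_vlm.convert --help` for your installed version. The output directory name and the `build_convert_command` helper are hypothetical, and the sketch does not pin the source revision or enable remote code.

```python
import shlex
import sys


def build_convert_command(hf_path: str, mlx_path: str, dtype: str = "bfloat16") -> list[str]:
    """Build (but do not run) an argv list for the mlx_vlm.convert CLI.

    Flag names are assumptions; verify them with --help before running.
    """
    return [
        sys.executable, "-m", "mlx_vlm.convert",
        "--hf-path", hf_path,    # upstream Hugging Face repo or local path
        "--mlx-path", mlx_path,  # output directory (hypothetical name below)
        "--dtype", dtype,
    ]


cmd = build_convert_command("tiiuae/Falcon-OCR", "Falcon-OCR-bf16")
# Print the command so it can be inspected or pasted into a shell.
print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to review the command first, and then pass `cmd` to `subprocess.run` once the flags are confirmed.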
## Known caveat
`mlx_vlm.convert` raises
`AttributeError: 'FalconOCRProcessor' object has no attribute 'save_pretrained'`
at the very end of the conversion step. The weights and tokenizer are written
successfully before the error is raised, so the artifacts uploaded here are
complete, and both `mlx_vlm.load(...)` and docling's `MlxVlmEngine` can consume them.
Tracked upstream: <https://github.com/Blaizzy/mlx-vlm/issues>.
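If you rerun the conversion yourself and hit the same error, a quick check of the output directory can confirm that the files were written before the failure. The sketch below is only illustrative: the expected file names (`config.json`, at least one `*.safetensors` file, and at least one `tokenizer*` file) are assumptions about a typical MLX model directory, and `missing_artifacts` and the directory name are hypothetical.

```python
from pathlib import Path


def missing_artifacts(model_dir: str) -> list[str]:
    """Return the expected artifact patterns that are absent from model_dir."""
    root = Path(model_dir)
    missing = []
    if not (root / "config.json").is_file():
        missing.append("config.json")
    # Weights may be split across several shards, so match by extension.
    if not any(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    # Tokenizer files vary by model (tokenizer.json, tokenizer_config.json, ...).
    if not any(root.glob("tokenizer*")):
        missing.append("tokenizer*")
    return missing


# An empty list means all assumed artifacts are present.
print(missing_artifacts("Falcon-OCR-bf16"))
```

This only checks that the files exist. It does not validate their contents, so loading the model remains the real test.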
## Usage
```python
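from mlx_vlm import load, generate

# trust_remote_code=True is required because the model uses custom processor/model classes.
model, processor = load("mlx-community/Falcon-OCR-bf16", trust_remote_code=True)
output = generate(model, processor, prompt="", image=["path/to/page.png"])
# Some mlx-vlm versions return a result object with a `.text` field instead of a plain string.
print(getattr(output, "text", output))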
```
## Attribution
All credit for the underlying model goes to the Falcon Perception Team at TII.
Cite the [model card](https://huggingface.co/tiiuae/Falcon-OCR) for academic
references.