Image-Text-to-Text
MLX
Safetensors
English
falcon_ocr
ocr
vision-language
falcon
apple-silicon
custom_code
Eval Results
Instructions to use mlx-community/Falcon-OCR-bf16 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use mlx-community/Falcon-OCR-bf16 with MLX:
    # Make sure mlx-vlm is installed
    # pip install --upgrade mlx-vlm

    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    # Load the model
    model, processor = load("mlx-community/Falcon-OCR-bf16")
    config = load_config("mlx-community/Falcon-OCR-bf16")

    # Prepare input
    image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
    prompt = "Describe this image."

    # Apply chat template
    formatted_prompt = apply_chat_template(
        processor, config, prompt, num_images=1
    )

    # Generate output
    output = generate(model, processor, formatted_prompt, image)
    print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
File size: 822 Bytes
28ccccd
{
  "channel_size": 3,
  "coord_dec_dim": 8192,
  "coord_enc_dim": 512,
  "coord_out_dim": 2048,
  "coord_token_id": 240,
  "dim": 768,
  "eos_id": 11,
  "ffn_dim": 2304,
  "head_dim": 64,
  "image_cls_token_id": 244,
  "image_reg_1_token_id": 245,
  "image_reg_2_token_id": 246,
  "image_reg_3_token_id": 247,
  "image_reg_4_token_id": 248,
  "img_end_id": 230,
  "img_id": 227,
  "img_row_sep_id": 228,
  "img_start_id": 229,
  "max_seq_len": 8192,
  "n_heads": 16,
  "n_kv_heads": 8,
  "n_layers": 22,
  "norm_eps": 1e-05,
  "num_segm_layers": 3,
  "perception_heads": false,
  "rope_theta": 10000,
  "seg_token_id": 262,
  "segm_out_dim": 256,
  "size_dec_dim": 8192,
  "size_enc_dim": 512,
  "size_out_dim": 2048,
  "size_token_id": 241,
  "spatial_patch_size": 16,
  "temporal_patch_size": 1,
  "vocab_size": 65536
}
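The config values imply a few shape relationships that are easy to check by hand. The sketch below is illustrative only and not part of the model's own code. It copies a subset of the values above and derives the attention widths, the query-to-KV head ratio, and the feed-forward expansion ratio. The helper names (`summarize`, `patch_grid`, `out_of_range_ids`) are hypothetical. Reading `spatial_patch_size` as the side length, in pixels, of non-overlapping square patches is an assumption, as is treating every key ending in `_id` as a token id.

```python
import json

# Subset of the values shown in the config above, copied verbatim.
CONFIG_JSON = """
{
  "dim": 768,
  "ffn_dim": 2304,
  "head_dim": 64,
  "n_heads": 16,
  "n_kv_heads": 8,
  "n_layers": 22,
  "max_seq_len": 8192,
  "spatial_patch_size": 16,
  "vocab_size": 65536,
  "eos_id": 11,
  "img_id": 227,
  "img_row_sep_id": 228,
  "img_start_id": 229,
  "img_end_id": 230,
  "coord_token_id": 240,
  "size_token_id": 241,
  "seg_token_id": 262
}
"""


def summarize(cfg: dict) -> dict:
    """Derive a few shape facts from the raw config values."""
    if cfg["n_heads"] % cfg["n_kv_heads"] != 0:
        raise ValueError("n_heads must be a multiple of n_kv_heads")
    return {
        # Total width of the query projection (heads x per-head width).
        "q_width": cfg["n_heads"] * cfg["head_dim"],
        # Total width of each of the key and value projections.
        "kv_width": cfg["n_kv_heads"] * cfg["head_dim"],
        # How many query heads share one key/value head.
        "queries_per_kv_head": cfg["n_heads"] // cfg["n_kv_heads"],
        # Feed-forward hidden width relative to the model width.
        "ffn_ratio": cfg["ffn_dim"] / cfg["dim"],
    }


def patch_grid(height: int, width: int, patch: int) -> tuple[int, int]:
    """Rows and columns of patches, ASSUMING non-overlapping square patches.

    Uses ceiling division, so a partial patch at an edge counts as one patch.
    """
    rows = -(-height // patch)
    cols = -(-width // patch)
    return rows, cols


def out_of_range_ids(cfg: dict) -> dict:
    """Return any `*_id` entries that fall outside [0, vocab_size)."""
    return {
        key: value
        for key, value in cfg.items()
        if key.endswith("_id") and not 0 <= value < cfg["vocab_size"]
    }


cfg = json.loads(CONFIG_JSON)
print(summarize(cfg))
print(patch_grid(1024, 768, cfg["spatial_patch_size"]))
print(out_of_range_ids(cfg))
```

One thing this makes visible: `n_heads * head_dim` is 1024, which is wider than `dim` (768). If the keys mean what their names suggest, the query projection would map 768 to 1024 features and the output projection would map back, but the config alone does not confirm this.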