	---
	library_name: mlx
	license: other
	license_name: modified-openrail-m
	license_link: LICENSE
	tags:
	- ocr
	- pdf
	- markdown
	- layout
	- mlx
	- 8bit
	- quantized
	pipeline_tag: image-text-to-text
	base_model: datalab-to/chandra-ocr-2
	---

	# Chandra OCR 2 — 8-bit MLX Quantization

	This is an 8-bit MLX quantization of [datalab-to/chandra-ocr-2](https://huggingface.co/datalab-to/chandra-ocr-2), converted for efficient inference on Apple Silicon using the [mlx-vlm](https://github.com/Blaizzy/mlx-vlm) framework.

	- Original model: [datalab-to/chandra-ocr-2](https://huggingface.co/datalab-to/chandra-ocr-2)
	- Quantization: 8-bit affine, group size 64
	- Framework: MLX (Apple Silicon)
	- Modified files: The weight file (`model.safetensors`) has been quantized from the original bfloat16 weights. All other files are unchanged from the original repository.

	## About Chandra OCR 2

	Chandra 2 is a state-of-the-art OCR model from [Datalab](https://www.datalab.to) that outputs markdown, HTML, and JSON. It is highly accurate at extracting text from images and PDFs while preserving layout information.

	### What's New in Chandra 2

	- 85.9% on olmOCR-Bench (SOTA) and 77.8% on the multilingual benchmark (a 12% improvement over Chandra 1)
	- Significant improvements to math, tables, and complex layouts
	- Improved layout, especially on wider documents
	- Significantly better image captioning
	- 90+ language support with major accuracy gains

	### Features

	- Convert documents to markdown, HTML, or JSON with detailed layout information
	- Excellent handwriting support
	- Reconstructs forms accurately, including checkboxes
	- Strong performance with tables, math, and complex layouts
	- Extracts images and diagrams with captions and structured data
	- Support for 90+ languages

	## Usage with mlx-vlm

	### Installation

	```bash
	pip install mlx-vlm
	```

	### Inference

	```python
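	from mlx_vlm import load, generate
	from mlx_vlm.prompt_utils import apply_chat_template
	from mlx_vlm.utils import load_config

	model_path = "jacobwindle/chandra-ocr-2-8bit-mlx"
	model, processor = load(model_path)
	config = load_config(model_path)

	# Paths (or URLs) of the page images to OCR.
	images = ["document.png"]
	prompt = "Convert this image to markdown."

	# Wrap the prompt in the model's chat template so the image placeholder is included.
	formatted_prompt = apply_chat_template(
	    processor, config, prompt, num_images=len(images)
	)

	# generate() runs the whole decoding loop and returns the decoded result.
	output = generate(
	    model, processor, formatted_prompt, images, max_tokens=4096, verbose=False
	)
	# Some mlx-vlm releases return a result object with a .text field, others a plain string.
	print(getattr(output, "text", output))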
	```
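For multi-page documents it can help to run the same single-page call over a folder of page images. The following is a minimal sketch, not part of mlx-vlm: `find_page_images`, `ocr_folder`, and the `ocr_page` callable are hypothetical names, and `ocr_page` is assumed to take an image path and return the Markdown for that page.

```python
from pathlib import Path
from typing import Callable

# File extensions treated as page images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".tif", ".tiff"}


def find_page_images(folder: Path) -> list[Path]:
    """Return image files in `folder`, sorted by name so page order is stable."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_folder(
    folder: Path, out_dir: Path, ocr_page: Callable[[Path], str]
) -> list[Path]:
    """Run `ocr_page` on every image and write one .md file per page."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in find_page_images(folder):
        markdown = ocr_page(image_path)  # hypothetical single-page OCR call
        target = out_dir / f"{image_path.stem}.md"
        target.write_text(markdown, encoding="utf-8")
        written.append(target)
    return written
```

Here `ocr_page` would wrap whatever mlx-vlm call you use for one image, so the model is loaded once and reused across pages. Sorting is lexical, so zero-padded file names (`page_001.png`) keep pages in order.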

	### Command-line

	```bash
	python -m mlx_vlm.generate --model jacobwindle/chandra-ocr-2-8bit-mlx --image document.png --prompt "Convert this image to markdown." --max-tokens 4096
	```

	## Quantization Details

	| Parameter | Value |
	|-----------|-------|
	| Bits | 8 |
	| Group size | 64 |
	| Mode | Affine |
	| Original dtype | bfloat16 |
	| Quantized size | ~4.8 GB |
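The bit width and group size in the table above determine the storage cost per weight. As a rough sketch, assuming each group of 64 weights stores one 16-bit scale and one 16-bit bias alongside its 8-bit codes (an assumption about the storage layout, not something stated in this repository), the arithmetic looks like this. The function names and the parameter count in the example are hypothetical.

```python
def effective_bits_per_weight(
    bits: int, group_size: int, scale_bits: int = 16, bias_bits: int = 16
) -> float:
    """Bits per weight once the per-group scale and bias are amortized."""
    return bits + (scale_bits + bias_bits) / group_size


def quantized_size_gb(
    num_quantized_params: float, bits: int = 8, group_size: int = 64
) -> float:
    """Approximate size in GB (10^9 bytes) of the quantized parameters only."""
    total_bits = num_quantized_params * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 1e9


bpw = effective_bits_per_weight(8, 64)  # 8.5
ratio_vs_bf16 = bpw / 16                # 0.53125
example_gb = quantized_size_gb(4e9)     # illustrative parameter count, not this model's
```

Under these assumptions the quantized layers take a little over half the space of their bfloat16 originals. Any layers left unquantized stay at 16 bits per weight and add to the total on disk.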

	Converted using:

	```bash
	python -m mlx_vlm.convert --hf-path datalab-to/chandra-ocr-2 --mlx-path models/chandra-ocr-2-8bit -q --q-bits 8
	```
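To illustrate what "8-bit affine, group size 64" means, the NumPy sketch below quantizes a weight vector in groups of 64, storing one scale and one bias (the group minimum) per group, and then reconstructs it. It shows the general scheme under simple assumptions. It is not the MLX implementation, which packs codes differently and may choose scales differently, and the function names are hypothetical.

```python
import numpy as np


def quantize_affine(w: np.ndarray, bits: int = 8, group_size: int = 64):
    """Quantize a 1-D float array group by group.

    Returns (codes, scales, biases) so that w is approximately
    codes * scales + biases. Codes are stored as uint8, so bits must be <= 8.
    """
    assert bits <= 8 and w.size % group_size == 0
    groups = w.astype(np.float32).reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    levels = 2**bits - 1                      # 255 steps for 8 bits
    scales = (w_max - w_min) / levels
    scales = np.where(scales == 0, 1.0, scales).astype(np.float32)  # constant groups
    codes = np.clip(np.round((groups - w_min) / scales), 0, levels).astype(np.uint8)
    return codes, scales, w_min


def dequantize_affine(codes: np.ndarray, scales: np.ndarray, biases: np.ndarray):
    """Reconstruct approximate float weights from codes, scales, and biases."""
    return codes.astype(np.float32) * scales + biases


rng = np.random.default_rng(0)
weights = rng.normal(size=256).astype(np.float32)  # 4 groups of 64
codes, scales, biases = quantize_affine(weights)
restored = dequantize_affine(codes, scales, biases).reshape(-1)
max_error = float(np.abs(weights - restored).max())
```

In this scheme the rounding error for any weight is bounded by half of its group's scale, which is why smaller groups (tighter min/max ranges) trade extra scale and bias storage for lower error.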

	## Attribution

	This is a derivative work of [datalab-to/chandra-ocr-2](https://huggingface.co/datalab-to/chandra-ocr-2). The original model was created by [Datalab](https://www.datalab.to). The weights in this repository have been modified (8-bit quantized) from the original release. All credit for the model architecture, training data, and original weights belongs to the original authors.

	## License

	This model inherits the modified OpenRAIL-M license from the original [datalab-to/chandra-ocr-2](https://huggingface.co/datalab-to/chandra-ocr-2). As a derivative work, the same license terms apply, including the share-alike requirement (Section III, paragraph 8) and use-based restrictions (Attachment A).

	Key restrictions from the original license:
	- Free for research, personal use, and startups under $2M funding/revenue
	- Cannot be used competitively with the Datalab API
	- Derivative works must retain the same license

	For broader commercial licensing, see [Datalab pricing](https://www.datalab.to/pricing).