---
language: en
license: cdla-permissive-2.0
pipeline_tag: image-text-to-text
library_name: mlx
tags:
- mlx
- mlx-vlm
- idefics3
- quantized
- 8-bit
base_model: docling-project/CodeFormulaV2
base_model_relation: quantized
datasets:
- ds4sd/SynthFormulaNet
- ds4sd/SynthCodeNet
---

# CodeFormulaV2-mlx-q8

8-bit quantized MLX conversion of [docling-project/CodeFormulaV2](https://huggingface.co/docling-project/CodeFormulaV2), produced with `mlx_vlm.convert --quantize --q-bits 8`. The text decoder linear layers are quantized to 8 bits per weight; the vision encoder stays at bf16 (the mlx-vlm convention via `skip_multimodal_module`).

Architecture, training data, and intended use are described on the upstream model page. This repo only re-encodes the weights at lower precision; no retraining or modification of behaviour was performed.

A bf16 variant is also available: [`mlx-community/CodeFormulaV2-mlx-bf16`](https://huggingface.co/mlx-community/CodeFormulaV2-mlx-bf16).

## Usage

```bash
pip install mlx-vlm
```

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template

model, processor = load("mlx-community/CodeFormulaV2-mlx-q8")
prompt = apply_chat_template(processor, model.config, "", num_images=1)
result = generate(
    model,
    processor,
    prompt=prompt,
    image="path/to/image.png",
    temperature=0.0,
)
print(result.text)
```

Use `""` as the prompt for a math-expression image, `""` for a code-block image, per the upstream model card.

## License and attribution

This is a derivative of [docling-project/CodeFormulaV2](https://huggingface.co/docling-project/CodeFormulaV2), redistributed under the same [Community Data License Agreement – Permissive 2.0 (CDLA-Permissive-2.0)](https://cdla.dev/permissive-2-0/).

Please cite the upstream work:

```bibtex
@techreport{Docling,
  author = {Deep Search Team},
  month = {8},
  title = {{Docling Technical Report}},
  url = {https://arxiv.org/abs/2408.09869},
  eprint = {2408.09869},
  year = {2024}
}
```
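
## Approximate weight footprint

The conversion described at the top of this card stores the text decoder linear layers at 8 bits per weight instead of bf16. As a rough illustration of what that means for storage, the sketch below estimates bytes per quantized layer. It assumes group-wise affine quantization with a group size of 64 and one 16-bit scale plus one 16-bit bias per group; those values are assumptions about the conversion defaults, not figures stated on this card, and the layer size used is hypothetical rather than a CodeFormulaV2 number.

```python
def effective_bits_per_weight(q_bits: int = 8,
                              group_size: int = 64,
                              meta_bits: int = 16) -> float:
    """Bits stored per weight, counting per-group metadata.

    Assumption: every group of `group_size` weights carries one scale and
    one bias, each stored in `meta_bits` bits.
    """
    return q_bits + 2 * meta_bits / group_size


def quantized_size_bytes(num_weights: int,
                         q_bits: int = 8,
                         group_size: int = 64,
                         meta_bits: int = 16) -> float:
    """Estimated storage in bytes for `num_weights` quantized weights."""
    bits = effective_bits_per_weight(q_bits, group_size, meta_bits)
    return num_weights * bits / 8


if __name__ == "__main__":
    n = 1_000_000  # hypothetical layer size, for illustration only
    bf16_bytes = n * 16 / 8
    q8_bytes = quantized_size_bytes(n)
    print(f"bf16: {bf16_bytes:.0f} bytes")
    print(f"q8:   {q8_bytes:.0f} bytes ({q8_bytes / bf16_bytes:.1%} of bf16)")
```

Under these assumptions a quantized layer takes a little more than half the space of its bf16 counterpart. The saving for the whole repository is smaller, because the vision encoder is kept at bf16.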