GGUF Backend Setup Guide

Quick Start (Recommended)

Since llama-cpp-python doesn't yet support LightOnOCR, we must build llama.cpp locally.

1. Build llama.cpp locally

# Clone repository
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Create build directory
mkdir build && cd build

# Build with Metal support (macOS)
cmake .. -DGGML_METAL=ON
cmake --build . --config Release -j 8

# Verify build
./bin/llama-mtmd-cli --help
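If you want to verify the build from Python tooling rather than by hand, a minimal sketch (the binary path follows the build layout above; `binary_works` is a helper invented here, not part of the project):

```python
import subprocess
from pathlib import Path

def binary_works(binary: Path) -> bool:
    """Return True if the built llama-mtmd-cli exists and responds to --help."""
    if not binary.exists():
        return False
    result = subprocess.run([str(binary), "--help"],
                            capture_output=True, text=True)
    return result.returncode == 0

# Default location from the build steps above
ok = binary_works(Path("llama.cpp/build/bin/llama-mtmd-cli"))
```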

2. Download GGUF Model

# Return to project root
cd ../../

# Run download script
python download_gguf_model.py
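After the script finishes, you can sanity-check that the files landed where the backend expects them. A minimal sketch (the filenames below are placeholders for illustration; the authoritative list lives in download_gguf_model.py):

```python
from pathlib import Path

# Hypothetical filenames; the real names come from download_gguf_model.py
EXPECTED_FILES = ["lightonocr-1b.Q8_0.gguf", "mmproj-lightonocr-1b.gguf"]

def missing_files(model_dir: str) -> list[str]:
    """Return the expected GGUF files that are not yet present."""
    d = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (d / name).exists()]
```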

3. Use GGUF Backend

# CLI
python ocr_cli.py document.pdf --backend gguf

# Gradio UI
python app.py
# Select "gguf" from backend dropdown
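Under the hood, the gguf backend shells out to llama-mtmd-cli. A sketch of how such a command line might be assembled (the flag names follow llama.cpp's multimodal CLI; the exact invocation used by ocr_cli.py may differ):

```python
def build_mtmd_command(binary: str, model: str, mmproj: str,
                       image: str, prompt: str) -> list[str]:
    """Assemble an argument list for a llama-mtmd-cli run."""
    return [
        binary,
        "-m", model,          # GGUF language model
        "--mmproj", mmproj,   # multimodal projector GGUF
        "--image", image,     # page image to OCR
        "-p", prompt,         # instruction prompt
    ]
```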

Performance

The custom-built llama-mtmd-cli delivers a large speedup on Apple Silicon:

| Backend | Time per Page | Speedup |
|---|---|---|
| PyTorch (Original) | ~4 min | 1x |
| PyTorch (Optimized) | ~40 sec | 6x |
| GGUF (llama-mtmd-cli) | ~3 sec | 80x |
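The speedup column is simply the ratio of per-page times against the original PyTorch baseline (4 min ≈ 240 s):

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """Speedup factor relative to a baseline time."""
    return baseline_s / optimized_s

print(round(speedup(240, 40)))  # optimized PyTorch -> 6
print(round(speedup(240, 3)))   # llama-mtmd-cli -> 80
```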

Troubleshooting

"llama-mtmd-cli binary not found"

Ensure you successfully built llama.cpp and the binary exists at llama.cpp/build/bin/llama-mtmd-cli.

"GGUF model not found"

Run python download_gguf_model.py to download the required model files.
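Both failure modes above can be caught up front with a small preflight check. A sketch, assuming the default binary location from step 1 and a hypothetical models/ directory for the GGUF files:

```python
from pathlib import Path

def preflight(binary: Path, model_dir: Path) -> list[str]:
    """Return a list of setup problems; empty if everything is in place."""
    problems = []
    if not binary.exists():
        problems.append("llama-mtmd-cli binary not found; rebuild llama.cpp")
    if not any(model_dir.glob("*.gguf")):
        problems.append("GGUF model not found; run download_gguf_model.py")
    return problems

# Default locations assumed from the steps above
issues = preflight(Path("llama.cpp/build/bin/llama-mtmd-cli"), Path("models"))
```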