---
title: Barzooka (Gradio)
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.29.0
python_version: 3.11
app_file: app.py
pinned: false
license: mit
---
# Barzooka (Hugging Face Space)
This Space wraps the **Barzooka** tool to screen PDFs or images and detect graph types
(bar, bardot, box, dot, violin, hist, pie, flow, text, other).
- Upstream repo: https://github.com/quest-bih/barzooka
- Dependencies in Spaces:
  - Python: pinned in `requirements.txt`
  - System: `poppler-utils` via `packages.txt` (provides `pdftocairo`)
## Usage
1. **Single PDF** tab: upload a PDF and choose aggregated vs page-wise output.
2. **ZIP of PDFs** tab: upload a `.zip` with multiple PDFs and download a CSV of results.
3. **Images** tab: upload JPG/PNG images to classify each image directly.
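
The ZIP workflow above can be sketched roughly as follows. This is a minimal illustration and not the Space's actual code: `classify_zip_to_csv` and the `classify_pdf` callable are hypothetical names, and the per-label count format is an assumption. It only shows the general shape of turning a `.zip` of PDFs into a CSV with one row per file.

```python
import csv
import io
import zipfile


def classify_zip_to_csv(zip_bytes, classify_pdf):
    """Build CSV text with one row per PDF found in a ZIP archive.

    `classify_pdf` is a hypothetical callable (not part of Barzooka's API)
    that takes (filename, pdf_bytes) and returns a dict of label -> count.
    """
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Skip folders and anything that is not a PDF.
            if info.is_dir() or not info.filename.lower().endswith(".pdf"):
                continue
            counts = classify_pdf(info.filename, archive.read(info))
            rows.append({"file": info.filename, **counts})

    # Use the union of all labels as columns so every row has the same shape.
    labels = sorted({key for row in rows for key in row if key != "file"})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["file", *labels], restval=0)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Passing the classifier in as a callable keeps the archive handling separate from the model, so the same loop can be exercised with a stub.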
## Model file (`barzooka.pkl`)
If the installed package does not include the model file (for example, because it is tracked with Git LFS), upload `barzooka.pkl` next to `app.py`
or set the environment variable `BARZOOKA_MODEL_URL` to a direct download URL for the file.
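
The lookup order described above could look roughly like the sketch below. It is an assumption about how such a fallback might be written, not the code in `app.py`: the function name `resolve_model_path` is hypothetical, and preferring the local file over the URL is a choice made here for illustration.

```python
import os
import urllib.request
from pathlib import Path

MODEL_NAME = "barzooka.pkl"


def resolve_model_path(app_dir, env=None):
    """Return the path to barzooka.pkl, downloading it when only a URL is set.

    Hypothetical helper: prefers a file next to app.py, then falls back to
    the BARZOOKA_MODEL_URL environment variable.
    """
    env = os.environ if env is None else env
    local = Path(app_dir) / MODEL_NAME
    if local.is_file():
        return local

    url = env.get("BARZOOKA_MODEL_URL")
    if not url:
        raise FileNotFoundError(
            f"{MODEL_NAME} not found in {app_dir} and BARZOOKA_MODEL_URL is not set"
        )

    # Download once and store the file next to app.py for later runs.
    urllib.request.urlretrieve(url, str(local))
    return local
```

In a Space, `app_dir` would typically be `Path(__file__).parent` inside `app.py`.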
## Notes
- Barzooka relies on `pdftocairo` to convert PDF pages to images. This Space installs `poppler-utils` automatically.
- FastAI `.pkl` files are sensitive to library versions; we pin fastai/torch/torchvision accordingly.
- Torch may require NumPy < 2; we pin `numpy==1.26.4` to avoid ABI mismatches.
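
When debugging a broken build, a quick environment check along these lines may help confirm the points above. This is a hedged sketch: the helper names are hypothetical, and it only verifies that `pdftocairo` is on `PATH` and that the installed NumPy has a major version below 2.

```python
import shutil
from importlib import metadata


def has_pdftocairo():
    """Return True if the pdftocairo binary (from poppler-utils) is on PATH."""
    return shutil.which("pdftocairo") is not None


def numpy_is_pre_2(version=None):
    """Return True if the given (or installed) NumPy version is below 2.

    Returns False when NumPy is not installed at all.
    """
    if version is None:
        try:
            version = metadata.version("numpy")
        except metadata.PackageNotFoundError:
            return False
    # Compare only the major component, e.g. "1.26.4" -> 1.
    return int(version.split(".")[0]) < 2


if __name__ == "__main__":
    print("pdftocairo available:", has_pdftocairo())
    print("NumPy < 2:", numpy_is_pre_2())
```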