Pengyuan Li

metadata
title: DocAI
emoji: 📊
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0

📊 DocAI - Document Intelligence Demo

A Hugging Face Space showcasing document parsing and chart intelligence using Docling, Granite Vision, and Chart2CSV.

🎯 Features

  • πŸ“„ Document Parsing - Extract text & figures from PDFs using Docling
  • πŸ–ΌοΈ Figure Detection - Automatically identify and extract figures from documents
  • πŸ’¬ Chart Q&A - Ask questions about charts using Granite Vision
  • πŸ“Š Chart-to-CSV - Convert charts to structured data (CSV tables)

🔄 Pipeline

Upload PDF
    ↓
Parse with Docling (extract text + figures)
    ↓
Select a figure/chart
    ↓
Ask questions OR Extract to CSV
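
The flow above can be sketched as a small orchestration function. This is only an illustration under stated assumptions: `ParsedDocument`, `run_pipeline`, and the callables passed in are hypothetical names, not the actual interfaces of the modules under `src/`.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ParsedDocument:
    """Hypothetical container for the output of the parsing step."""
    text: str
    figures: List[bytes] = field(default_factory=list)


def run_pipeline(
    pdf_bytes: bytes,
    parse: Callable[[bytes], ParsedDocument],
    figure_index: int,
    action: Callable[[bytes], str],
) -> str:
    """Parse a PDF, pick one figure, and apply a Q&A or chart-to-CSV step."""
    doc = parse(pdf_bytes)  # step 2: extract text + figures
    if not doc.figures:
        raise ValueError("no figures were detected in this document")
    if not 0 <= figure_index < len(doc.figures):
        raise IndexError(f"figure_index must be in [0, {len(doc.figures) - 1}]")
    figure = doc.figures[figure_index]  # step 3: select a figure/chart
    return action(figure)  # step 4: ask a question OR extract to CSV
```

Passing the parsing and inference steps in as callables keeps the sketch independent of any model, so each stage can be swapped or stubbed out when testing.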

🚀 Getting Started

Local Testing

# Install dependencies
pip install -r requirements.txt

# Run app
python app.py

Then open http://localhost:7860

Deploy to HF Spaces

# Commit and push (already set up as HF Space)
git add -A
git commit -m "Update DocAI demo"
git push

📦 Architecture

DocAI/
├── app.py                    # Main Gradio interface
├── requirements.txt          # Dependencies
├── src/
│   ├── ui_state.py           # State management & caching
│   ├── pdf_io.py             # PDF rendering (PyMuPDF)
│   ├── docling_parse.py      # Document parsing
│   ├── crops.py              # Figure extraction
│   ├── infer_vision_qa.py    # Granite Vision Q&A
│   ├── infer_chart2csv.py    # Chart2CSV extraction
│   └── utils.py              # Utilities
└── README.md

🤖 Models Used

  • Docling: ibm-granite/granite-docling-258M - Document understanding
  • Granite Vision: ibm-granite/granite-vision-3.3-2b - Image Q&A
  • Chart2CSV: ibm-granite/granite-vision-3.3-2b-chart2csv-preview - Table extraction
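
One way to keep these identifiers in a single place is a small lookup table. The sketch below is an assumption about how the code could be organized, not a description of how `app.py` does it; the task keys and the `model_id_for` helper are hypothetical, while the model IDs are the ones listed above.

```python
# Model IDs copied from the list above; the task keys are hypothetical labels.
MODEL_IDS = {
    "docling": "ibm-granite/granite-docling-258M",
    "vision_qa": "ibm-granite/granite-vision-3.3-2b",
    "chart2csv": "ibm-granite/granite-vision-3.3-2b-chart2csv-preview",
}


def model_id_for(task: str) -> str:
    """Return the Hub model ID for a task, failing loudly on unknown tasks."""
    try:
        return MODEL_IDS[task]
    except KeyError:
        known = ", ".join(sorted(MODEL_IDS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```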

⚙️ Configuration

Hugging Face Hub

The Space configuration is defined in the YAML front matter at the top of README.md:

  • SDK: Gradio 6.6.0
  • App File: app.py
  • License: Apache 2.0
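
To check these settings programmatically, the front matter can be read with a few lines of standard-library Python. This is a minimal sketch that assumes the block is delimited by `---` lines and contains only flat `key: value` pairs, as in this README; `parse_front_matter` is a hypothetical helper, and a real YAML parser would be the safer choice for anything more complex.

```python
def parse_front_matter(readme_text: str) -> dict:
    """Parse flat `key: value` front matter delimited by `---` lines.

    Values are returned as strings; no YAML type conversion is attempted.
    Returns an empty dict if the opening or closing delimiter is missing.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    config = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return config  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return {}  # front matter was never closed
```

Because values stay as strings, `pinned: false` comes back as `"false"` rather than a boolean.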

Environment

No secrets required. All models are public on Hugging Face Hub.

📝 Usage Examples

Chart Q&A

"What is the trend in this chart?" → Granite Vision analyzes and responds

Chart2CSV

Upload a bar chart → Get structured CSV data with axis labels and values
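
Once the CSV text is available, it can be loaded into rows with Python's `csv` module. The sketch below assumes the output is plain comma-separated text with a header row; `csv_text_to_rows` is a hypothetical helper, and the sample string is made-up placeholder data, not real model output.

```python
import csv
import io


def csv_text_to_rows(csv_text: str) -> list[dict]:
    """Turn CSV text with a header row into a list of {column: value} dicts."""
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    return [dict(row) for row in reader]


# Placeholder data for illustration only (not produced by the model).
sample = "Quarter,Revenue\nQ1,10\nQ2,15\n"
rows = csv_text_to_rows(sample)
```

All values are returned as strings, so numeric columns need an explicit conversion such as `float(row["Revenue"])` before plotting or aggregation.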

⚠️ Known Limitations

  • PDF parsing depends on Docling model accuracy
  • Figure detection may miss complex multi-panel layouts
  • Q&A quality depends on image clarity and model capabilities
  • Large PDFs (100+ pages) may be slow on CPU

📄 License

Apache 2.0 - See LICENSE file

🙏 Credits

Built with: