Pengyuan Li

Fix PyMuPDF build failure on Python 3.13 (HF Spaces)

5cf1d31 6 days ago

metadata

title: DocAI
emoji: 📊
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0

📊 DocAI - Document Intelligence Demo

A Hugging Face Space showcasing document parsing and chart intelligence using Docling, Granite Vision, and Chart2CSV.

🎯 Features

📄 Document Parsing - Extract text & figures from PDFs using Docling
🖼️ Figure Detection - Automatically identify and extract figures from documents
💬 Chart Q&A - Ask questions about charts using Granite Vision
📊 Chart-to-CSV - Convert charts to structured data (CSV tables)
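Figure extraction ultimately means cropping a region out of a rendered page. As a minimal sketch, assuming figure boxes are reported in PDF points with a bottom-left origin (an assumption; the coordinate convention this app uses is not stated here), the conversion to pixel coordinates could look like the following. `PixelBox` and `pdf_bbox_to_pixels` are hypothetical names, not part of the repository.

```python
import math
from typing import NamedTuple, Tuple


class PixelBox(NamedTuple):
    """Crop box in image pixels, top-left origin (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def pdf_bbox_to_pixels(
    bbox: Tuple[float, float, float, float],
    page_height_pt: float,
    dpi: int = 144,
) -> PixelBox:
    """Convert (x0, y0, x1, y1) in PDF points to a pixel crop box.

    Assumes a bottom-left origin for the input box and a page image
    rendered at `dpi`. PDF user space has 72 points per inch.
    """
    x0, y0, x1, y1 = bbox
    scale = dpi / 72.0
    # Flip the y axis: image rows grow downward, PDF y grows upward.
    top = (page_height_pt - y1) * scale
    bottom = (page_height_pt - y0) * scale
    # Round outward so the crop never clips the figure.
    return PixelBox(
        left=math.floor(x0 * scale),
        top=math.floor(top),
        right=math.ceil(x1 * scale),
        bottom=math.ceil(bottom),
    )
```

The resulting box can be passed to any image library's crop call that expects `(left, top, right, bottom)`.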

🔄 Pipeline

Upload PDF
    ↓
Parse with Docling (extract text + figures)
    ↓
Select a figure/chart
    ↓
Ask questions OR Extract to CSV
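The flow above can be sketched as a single function that takes each stage as a callable. This is only an illustration of the control flow: `ParsedDocument`, `run_pipeline`, and the `parse`, `ask`, and `to_csv` parameters are hypothetical stand-ins, not the functions defined in `app.py` or `src/`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class ParsedDocument:
    """Hypothetical container for the parser output."""
    text: str
    figures: List[Any] = field(default_factory=list)


def run_pipeline(
    pdf_path: str,
    parse: Callable[[str], ParsedDocument],
    figure_index: int,
    ask: Callable[[Any, str], str],
    to_csv: Callable[[Any], str],
    question: Optional[str] = None,
) -> str:
    """Parse a PDF, pick one figure, then either answer a question or extract CSV."""
    doc = parse(pdf_path)                      # Parse (text + figures)
    if not doc.figures:
        raise ValueError("no figures were detected in the document")
    figure = doc.figures[figure_index]         # Select a figure/chart
    if question is not None:
        return ask(figure, question)           # Ask questions ...
    return to_csv(figure)                      # ... OR extract to CSV
```

Passing the stages in as callables keeps the sketch independent of any particular model API and makes it easy to test with stubs.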

🚀 Getting Started

Local Testing

# Install dependencies
pip install -r requirements.txt

# Run app
python app.py

Then open http://localhost:7860 in your browser (7860 is Gradio's default port).

Deploy to HF Spaces

# Commit and push (already set up as HF Space)
git add -A
git commit -m "Update DocAI demo"
git push

📦 Architecture

DocAI/
├── app.py                    # Main Gradio interface
├── requirements.txt          # Dependencies
├── src/
│   ├── ui_state.py          # State management & caching
│   ├── pdf_io.py            # PDF rendering (PyMuPDF)
│   ├── docling_parse.py     # Document parsing
│   ├── crops.py             # Figure extraction
│   ├── infer_vision_qa.py   # Granite Vision Q&A
│   ├── infer_chart2csv.py   # Chart2CSV extraction
│   └── utils.py             # Utilities
└── README.md
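The tree lists `ui_state.py` as handling state management and caching. How it does so is not shown here; one common approach, sketched below under that caveat, is to key parse results by a hash of the uploaded file so the same PDF is never parsed twice. `ParseCache` is a hypothetical name and `parse_fn` stands in for whatever parsing function the app uses.

```python
import hashlib
from typing import Any, Callable, Dict


class ParseCache:
    """In-memory cache of parse results keyed by file content (illustrative only)."""

    def __init__(self, parse_fn: Callable[[bytes], Any]) -> None:
        self._parse_fn = parse_fn
        self._results: Dict[str, Any] = {}

    def get(self, pdf_bytes: bytes) -> Any:
        """Return the cached result for these bytes, parsing on first sight."""
        # Hash the content rather than the filename, so a re-upload of the
        # same file under a different name still hits the cache.
        key = hashlib.sha256(pdf_bytes).hexdigest()
        if key not in self._results:
            self._results[key] = self._parse_fn(pdf_bytes)
        return self._results[key]
```

An unbounded dictionary is fine for a short demo session; a long-running deployment would likely want an eviction policy.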

🤖 Models Used

Docling: ibm-granite/granite-docling-258M - Document understanding
Granite Vision: ibm-granite/granite-vision-3.3-2b - Image Q&A
Chart2CSV: ibm-granite/granite-vision-3.3-2b-chart2csv-preview - Table extraction
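For reference, the three model IDs above can be kept in one place so the rest of the code refers to tasks rather than repository names. This is a sketch of one possible layout; the task keys and the `model_id_for` helper are hypothetical, and only the model IDs themselves come from the list above.

```python
# Model IDs copied from the list above; the task keys are illustrative.
MODEL_IDS = {
    "document_parsing": "ibm-granite/granite-docling-258M",
    "chart_qa": "ibm-granite/granite-vision-3.3-2b",
    "chart_to_csv": "ibm-granite/granite-vision-3.3-2b-chart2csv-preview",
}


def model_id_for(task: str) -> str:
    """Look up the Hub model ID for a task, with a clear error for typos."""
    try:
        return MODEL_IDS[task]
    except KeyError:
        known = ", ".join(sorted(MODEL_IDS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```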

⚙️ Configuration

Hugging Face Hub

The Space configuration is defined in the YAML frontmatter at the top of README.md:

SDK: Gradio 6.6.0
App File: app.py
License: Apache 2.0
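To check these values programmatically, the frontmatter can be read with a few lines of Python. The sketch below assumes the block is delimited by `---` lines and contains only flat `key: value` pairs, as in this README; it is not a general YAML parser, and `parse_frontmatter` is a hypothetical helper.

```python
from typing import Dict


def parse_frontmatter(readme_text: str) -> Dict[str, str]:
    """Extract flat key/value pairs from a leading '---' delimited block.

    Values are returned as plain strings. Nested YAML, lists, and quoting
    are deliberately not handled in this sketch.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    config: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config
```

For anything beyond flat pairs, a real YAML library is the safer choice.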

Environment

No secrets required. All models are public on Hugging Face Hub.

📝 Usage Examples

Chart Q&A

"What is the trend in this chart?" → Granite Vision analyzes and responds

Chart2CSV

Upload a bar chart → Get structured CSV data with axis labels and values
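Once the CSV text is available, it can be turned into rows with the standard library. The sketch below assumes the extraction step returns plain comma-separated text with a header row, which is an assumption about the output format; `parse_chart_csv` is a hypothetical helper.

```python
import csv
import io
from typing import Dict, List


def parse_chart_csv(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text with a header row into a list of row dictionaries.

    Values are kept as strings; converting them to numbers is left to
    the caller, since chart values may carry units or percent signs.
    """
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    return [dict(row) for row in reader]
```

For example, a made-up input such as `"Quarter,Revenue\nQ1,10\nQ2,15"` would yield one dictionary per data row, keyed by the header names.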

⚠️ Known Limitations

PDF parsing quality depends on the accuracy of the Docling model
Figure detection may miss complex multi-panel layouts
Q&A quality depends on image clarity and on the capabilities of Granite Vision
Large PDFs (100+ pages) may be slow to process on CPU
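One way to soften the large-PDF limitation is to process pages in small batches so partial results can be shown early. The app is not known to do this; the sketch below only illustrates the idea, and `page_batches` is a hypothetical helper.

```python
from typing import Iterator


def page_batches(num_pages: int, batch_size: int = 10) -> Iterator[range]:
    """Yield consecutive zero-based page ranges of at most `batch_size` pages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, num_pages, batch_size):
        # The last batch may be shorter than batch_size.
        yield range(start, min(start + batch_size, num_pages))
```

Each yielded range can be handed to the parsing step in turn, with the UI updated between batches.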

📄 License

Apache 2.0 - See LICENSE file

🙏 Credits

Built with:

Docling - IBM
Granite - IBM AI Research
Gradio - Hugging Face