---
title: DocAI
emoji: 📊
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0
---

# 📊 DocAI - Document Intelligence Demo

A Hugging Face Space showcasing document parsing and chart intelligence using **Docling**, **Granite Vision**, and **Chart2CSV**.

## 🎯 Features

- **📄 Document Parsing** - Extract text & figures from PDFs using Docling
- **🖼️ Figure Detection** - Automatically identify and extract figures from documents
- **💬 Chart Q&A** - Ask questions about charts using Granite Vision
- **📊 Chart-to-CSV** - Convert charts to structured data (CSV tables)

## 🔄 Pipeline

```
Upload PDF
    ↓
Parse with Docling (extract text + figures)
    ↓
Select a figure/chart
    ↓
Ask questions OR Extract to CSV
```

## 🚀 Getting Started

### Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Run app
python app.py
```

Then open http://localhost:7860

### Deploy to HF Spaces

```bash
# Commit and push (already set up as HF Space)
git add -A
git commit -m "Update DocAI demo"
git push
```

## 📦 Architecture

```
DocAI/
├── app.py                    # Main Gradio interface
├── requirements.txt          # Dependencies
├── src/
│   ├── ui_state.py           # State management & caching
│   ├── pdf_io.py             # PDF rendering (PyMuPDF)
│   ├── docling_parse.py      # Document parsing
│   ├── crops.py              # Figure extraction
│   ├── infer_vision_qa.py    # Granite Vision Q&A
│   ├── infer_chart2csv.py    # Chart2CSV extraction
│   └── utils.py              # Utilities
└── README.md
```

## 🤖 Models Used

- **Docling**: `ibm-granite/granite-docling-258M` - Document understanding
- **Granite Vision**: `ibm-granite/granite-vision-3.3-2b` - Image Q&A
- **Chart2CSV**: `ibm-granite/granite-vision-3.3-2b-chart2csv-preview` - Table extraction

## ⚙️ Configuration

### HuggingFace Hub

The Space configuration is defined in `README.md` (YAML frontmatter):

- **SDK**: Gradio 6.6.0
- **App File**: `app.py`
- **License**: Apache 2.0

### Environment

No secrets required. All models are public on Hugging Face Hub.

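Because the models are public, they can in principle be fetched anonymously ahead of time, for example to warm the local cache before the first run. The following is a minimal sketch under that assumption: the `MODEL_IDS` mapping and the `prefetch_models` helper are hypothetical names that are not part of this repo, and it is not confirmed that `app.py` loads models this way. Only `snapshot_download` from `huggingface_hub` is an existing library call.

```python
# Hypothetical helper: pre-download the public models listed in "Models Used".
# The repo IDs are copied from this README; everything else is illustrative.

MODEL_IDS = {
    "docling": "ibm-granite/granite-docling-258M",
    "vision_qa": "ibm-granite/granite-vision-3.3-2b",
    "chart2csv": "ibm-granite/granite-vision-3.3-2b-chart2csv-preview",
}


def prefetch_models(model_ids=MODEL_IDS, cache_dir=None):
    """Download each repo into the local Hugging Face cache.

    Returns a dict mapping the short name to the local snapshot path.
    """
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import snapshot_download

    paths = {}
    for name, repo_id in model_ids.items():
        # No token is passed: the repos are assumed to allow anonymous access.
        paths[name] = snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
    return paths
```

Calling `prefetch_models()` once before `python app.py` would download the weights up front instead of on first use. Passing a `cache_dir` keeps the downloads in a project-local folder.
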
## 📝 Usage Examples

### Chart Q&A

"What is the trend in this chart?" → Granite Vision analyzes and responds

### Chart2CSV

Upload a bar chart → Get structured CSV data with axis labels and values

## ⚠️ Known Limitations

- PDF parsing depends on Docling model accuracy
- Figure detection may miss complex multi-panel layouts
- Q&A quality depends on image clarity and model capabilities
- Large PDFs (100+ pages) may be slow on CPU

## 📄 License

Apache 2.0 - See LICENSE file

## 🙏 Credits

Built with:

- [Docling](https://github.com/DS4SD/docling) - IBM
- [Granite](https://huggingface.co/ibm-granite) - IBM AI Research
- [Gradio](https://gradio.app/) - Hugging Face