---
title: Invoice Processor
emoji: 🧾
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 8501
pinned: false
short_description: AI-powered PDF invoice data extraction and dashboard
---
# Invoice Processor — AI-Powered PDF Data Extraction
An invoice processing tool that extracts structured data from PDF invoices using **Claude AI (Anthropic)** and presents the results in an interactive dashboard. Upload one or more invoices to get a clean breakdown of vendors, amounts, dates, taxes, and line items, with no manual data entry required.
## What it does
The application reads raw PDF invoices and uses a large language model to understand their content regardless of format or layout. It identifies and extracts the key fields from each invoice, normalizes them into a consistent structure, and displays everything in an interactive Streamlit dashboard with charts and export options.
**Extracted fields include:**
- Vendor name and contact details
- Invoice number and date
- Line items with descriptions, quantities, and unit prices
- Subtotal, taxes (VAT/IVA), and total amount
- Payment terms and due date
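As an illustration of what a normalized record for the fields above could look like, here is a minimal sketch using dataclasses. The class names, field names, and the `totals_consistent` check are assumptions made for this example and are not taken from the app's code; the sample values are invented.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class LineItem:
    description: str
    quantity: float
    unit_price: float

    @property
    def amount(self) -> float:
        # Line amount derived from quantity and unit price.
        return round(self.quantity * self.unit_price, 2)


@dataclass
class Invoice:
    # Hypothetical field names mirroring the extracted fields listed above.
    vendor_name: str
    invoice_number: str
    invoice_date: str  # ISO 8601 date string, e.g. "2024-03-15"
    vendor_contact: Optional[str] = None
    line_items: List[LineItem] = field(default_factory=list)
    subtotal: float = 0.0
    tax: float = 0.0  # VAT/IVA
    total: float = 0.0
    payment_terms: Optional[str] = None
    due_date: Optional[str] = None

    def totals_consistent(self, tolerance: float = 0.01) -> bool:
        # Sanity check: subtotal + tax should match the total within a cent.
        return abs(self.subtotal + self.tax - self.total) <= tolerance


# Invented sample data, for illustration only.
invoice = Invoice(
    vendor_name="Acme Supplies",
    invoice_number="INV-001",
    invoice_date="2024-03-15",
    line_items=[LineItem("Paper", 10, 4.50)],
    subtotal=45.00,
    tax=9.45,
    total=54.45,
)
record = asdict(invoice)  # plain dict, ready for JSON or a DataFrame
```

A consistency check like this is one way to flag extractions where the model misread an amount, since the three totals are extracted independently.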
## Tech stack
| Layer | Technology |
|---|---|
| UI | Streamlit |
| AI extraction | Claude (Anthropic API) |
| PDF parsing | pypdf |
| Data visualization | Plotly |
| Containerization | Docker |
## How to use
1. Upload one or more PDF invoices using the file uploader in the sidebar.
2. Click **Process** to run the AI extraction pipeline.
3. Explore the dashboard — view per-invoice details, compare vendors, and analyze spending trends.
4. Export the extracted data as CSV for further analysis.
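The CSV export in step 4 could be sketched roughly as follows, using only the standard library. The column names and the one-row-per-line-item layout are assumptions for this example; the app's actual export format may differ.

```python
import csv
import io

# Hypothetical column layout: invoice-level fields repeated on every line item row.
FIELDNAMES = [
    "invoice_number", "vendor_name", "description",
    "quantity", "unit_price", "invoice_total",
]


def invoices_to_csv(invoices):
    """Flatten a list of invoice dicts into CSV text, one row per line item."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for inv in invoices:
        # An invoice without line items still gets one row with empty item columns.
        items = inv.get("line_items") or [{}]
        for item in items:
            writer.writerow({
                "invoice_number": inv.get("invoice_number", ""),
                "vendor_name": inv.get("vendor_name", ""),
                "description": item.get("description", ""),
                "quantity": item.get("quantity", ""),
                "unit_price": item.get("unit_price", ""),
                "invoice_total": inv.get("total", ""),
            })
    return buffer.getvalue()


# Invented sample data, for illustration only.
csv_text = invoices_to_csv([
    {
        "invoice_number": "INV-001",
        "vendor_name": "Acme, Inc.",
        "total": 54.45,
        "line_items": [{"description": "Paper", "quantity": 10, "unit_price": 4.5}],
    }
])
```

Writing through the `csv` module rather than joining strings by hand means vendor names containing commas or quotes are escaped correctly.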
## Configuration
Requires an Anthropic API key set as a Secret in the Space settings:
```
ANTHROPIC_API_KEY=sk-ant-... # get yours at https://console.anthropic.com
```
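Space secrets are exposed to the running container as environment variables, so the key can be read at startup. The helper below is a minimal sketch; the function name `load_api_key` is hypothetical and not taken from the app's code.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Anthropic API key, failing early with a clear message if it is missing."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Add it as a Secret in the Space settings."
        )
    return key
```

Failing at startup gives a clearer error than letting the first extraction request fail with an authentication error.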
## Architecture
```
PDF Upload
    ↓
pypdf (text extraction)
    ↓
Claude AI (structured data extraction)
    ↓
Normalized JSON schema
    ↓
Streamlit Dashboard (charts, tables, export)
```
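The flow above can be sketched as a small pipeline function. To keep the example runnable without external packages, the PDF text extractor and the model call are passed in as plain callables; in the app they would presumably wrap pypdf and the Anthropic client. All function names, the prompt wording, and the `REQUIRED_FIELDS` tuple are assumptions for illustration, not the app's actual implementation.

```python
import json
from typing import Callable

# Hypothetical minimal set of keys expected in the model's JSON reply.
REQUIRED_FIELDS = ("vendor_name", "invoice_number", "invoice_date", "total")


def build_prompt(invoice_text: str) -> str:
    """Ask the model for a single JSON object with the expected keys."""
    keys = ", ".join(REQUIRED_FIELDS)
    return (
        "Extract the following fields from the invoice text and reply with "
        f"one JSON object containing at least these keys: {keys}.\n\n"
        f"Invoice text:\n{invoice_text}"
    )


def parse_model_reply(reply: str) -> dict:
    """Pull the JSON object out of a reply that may include surrounding prose."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start : end + 1])
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"model reply is missing fields: {missing}")
    return data


def process_invoice(
    pdf_bytes: bytes,
    extract_text: Callable[[bytes], str],  # e.g. a wrapper around pypdf
    ask_model: Callable[[str], str],       # e.g. a wrapper around the Anthropic API
) -> dict:
    """PDF bytes -> raw text -> model reply -> validated dict."""
    text = extract_text(pdf_bytes)
    if not text.strip():
        # A PDF with no text layer (for example a scan) yields nothing to send.
        raise ValueError("no extractable text in PDF")
    return parse_model_reply(ask_model(build_prompt(text)))
```

Injecting the two callables also makes the pipeline easy to test with canned text and canned replies, without network access or an API key.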