---
title: Invoice Processor
emoji: 🧾
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 8501
pinned: false
short_description: AI-powered PDF invoice data extraction and dashboard
---

# Invoice Processor: AI-Powered PDF Data Extraction

An intelligent invoice processing tool that extracts structured data from PDF invoices using **Claude AI (Anthropic)** and presents the results in an interactive dashboard. Upload one or multiple invoices and instantly get a clean breakdown of vendors, amounts, dates, taxes, and line items, with no manual data entry required.

## What it does

The application reads raw PDF invoices and uses a large language model to understand their content regardless of format or layout. It identifies and extracts the key fields from each invoice, normalizes them into a consistent structure, and displays everything in an interactive Streamlit dashboard with charts and export options.

**Extracted fields include:**

- Vendor name and contact details
- Invoice number and date
- Line items with descriptions, quantities, and unit prices
- Subtotal, taxes (VAT/IVA), and total amount
- Payment terms and due date

## Tech stack

| Layer | Technology |
|---|---|
| UI | Streamlit |
| AI extraction | Claude (Anthropic API) |
| PDF parsing | pypdf |
| Data visualization | Plotly |
| Containerization | Docker |

## How to use

1. Upload one or more PDF invoices using the file uploader in the sidebar.
2. Click **Process** to run the AI extraction pipeline.
3. Explore the dashboard: view per-invoice details, compare vendors, and analyze spending trends.
4. Export the extracted data as CSV for further analysis.

## Configuration

Requires an Anthropic API key set as a Secret in the Space settings:

```
ANTHROPIC_API_KEY=sk-ant-...
# get yours at https://console.anthropic.com
```

## Architecture

```
PDF Upload
    │
    ▼
pypdf (text extraction)
    │
    ▼
Claude AI (structured data extraction)
    │
    ▼
Normalized JSON schema
    │
    ▼
Streamlit Dashboard (charts, tables, export)
```
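
The "Normalized JSON schema" step could look roughly like the sketch below. It is an illustration only, not the code this Space actually runs: the field names (`vendor_name`, `invoice_number`, `line_items`, and so on), the `Invoice` and `LineItem` classes, and the `normalize_invoice` helper are all assumptions, and the input is assumed to be a JSON string returned by the model. It uses only the Python standard library.

```python
import json
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class LineItem:
    description: str
    quantity: float
    unit_price: float


@dataclass
class Invoice:
    # Hypothetical schema; the real field names may differ.
    vendor_name: str
    invoice_number: str
    invoice_date: Optional[str]  # ISO 8601 date, or None if unparseable
    subtotal: float
    tax: float
    total: float
    line_items: List[LineItem] = field(default_factory=list)


def to_float(value) -> float:
    """Coerce numbers or strings such as "1,250.00" to float; empty -> 0.0.

    Assumes "," is a thousands separator and "." the decimal point.
    """
    if value is None or value == "":
        return 0.0
    if isinstance(value, (int, float)):
        return float(value)
    return float(str(value).replace(",", "").strip())


def to_iso_date(value) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None when it is missing or not ISO."""
    if not value:
        return None
    try:
        return date.fromisoformat(str(value).strip()).isoformat()
    except ValueError:
        return None


def normalize_invoice(raw_json: str) -> Invoice:
    """Map a raw JSON string (assumed model output) onto the Invoice schema."""
    data = json.loads(raw_json)
    subtotal = to_float(data.get("subtotal"))
    tax = to_float(data.get("tax"))
    raw_total = data.get("total")
    # Fall back to subtotal + tax when the total is missing.
    total = subtotal + tax if raw_total in (None, "") else to_float(raw_total)
    items = [
        LineItem(
            description=str(item.get("description", "")).strip(),
            quantity=to_float(item.get("quantity")),
            unit_price=to_float(item.get("unit_price")),
        )
        for item in data.get("line_items", [])
    ]
    return Invoice(
        vendor_name=str(data.get("vendor_name", "")).strip(),
        invoice_number=str(data.get("invoice_number", "")).strip(),
        invoice_date=to_iso_date(data.get("invoice_date")),
        subtotal=subtotal,
        tax=tax,
        total=total,
        line_items=items,
    )
```

Coercing every invoice into one dataclass keeps the dashboard and CSV export independent of how each PDF was laid out. Invoices that write amounts as `1.250,00` would need locale-aware parsing, which this sketch does not attempt.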