Invoice Information Extractor API

Extract structured information from Indian tractor invoices through an AI-powered REST API.

What It Does

Combines YOLO (signature/stamp detection) with Qwen2.5-VL (text extraction) to extract:

Dealer name
Model name
Horse power
Asset cost
Signature presence & location
Stamp presence & location

Architecture

Production (Hugging Face Deployment)

FastAPI server with REST endpoints
Models loaded on startup and cached in memory
YOLO model stored locally in utils/models/best.pt
Qwen2.5-VL downloaded from Hugging Face on first run (not stored locally)
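
The load-once, keep-in-memory behaviour described above can be sketched as follows. This is a minimal illustration, not the contents of model_manager.py: the ModelCache class and the loader function are hypothetical stand-ins for the real YOLO and Qwen2.5-VL loading code.

```python
import threading
from typing import Any, Callable, Dict


class ModelCache:
    """Load each model at most once and keep it in memory (hypothetical helper)."""

    def __init__(self) -> None:
        self._models: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, name: str, loader: Callable[[], Any]) -> Any:
        # Hold the lock so concurrent requests cannot trigger a duplicate load.
        with self._lock:
            if name not in self._models:
                self._models[name] = loader()
            return self._models[name]

    def loaded(self, *names: str) -> bool:
        # True only when every named model is already cached.
        with self._lock:
            return all(n in self._models for n in names)


# Stand-in loader; a real one would read utils/models/best.pt.
load_count = {"yolo": 0}


def fake_yolo_loader() -> str:
    load_count["yolo"] += 1
    return "yolo-model"


cache = ModelCache()
first = cache.get("yolo", fake_yolo_loader)
second = cache.get("yolo", fake_yolo_loader)  # served from memory, loader not called again
```

A helper like loaded() is one way a health endpoint could report whether both models are ready.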

Key Components

app.py - FastAPI server with endpoints
model_manager.py - Handles model loading and caching
inference.py - Processing pipeline and validation
config.py - Configuration settings
executable.py - Legacy CLI interface (deprecated)

Installation

pip install -r requirements.txt

Requirements: Python 3.10+, CUDA GPU (8GB+ VRAM)

Running the Server

Local Development

python app.py

Server runs on http://localhost:7860

Production (Hugging Face Spaces)

uvicorn app:app --host 0.0.0.0 --port 7860

API Endpoints

1. Health Check

GET /health

Response:

{
  "status": "healthy",
  "models_loaded": true
}
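
A client may want to wait for the models to finish loading before sending invoices. The sketch below uses only the Python standard library; the function names are illustrative, and treating "models_loaded": false as "not ready" is an assumption about how the flag is meant to be used.

```python
import json
import urllib.request


def is_ready(payload: dict) -> bool:
    """Return True when a /health payload reports a usable server."""
    return payload.get("status") == "healthy" and payload.get("models_loaded") is True


def check_health(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> bool:
    # Perform GET /health and interpret the JSON body.
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return is_ready(json.load(resp))


# Interpreting the documented response without contacting a server:
sample = json.loads('{"status": "healthy", "models_loaded": true}')
ready = is_ready(sample)
```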

2. Extract Single Invoice

POST /extract

Parameters:

file (required): Image file (JPG, PNG, JPEG)
doc_id (optional): Document identifier

Example (cURL):

curl -X POST "http://localhost:7860/extract" \
  -F "file=@invoice_001.png" \
  -F "doc_id=invoice_001"
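
The same request can be sent from Python. The sketch below avoids third-party dependencies by encoding the multipart form by hand; build_multipart and extract_invoice are hypothetical helper names, and only the file and doc_id form fields come from the parameter list above.

```python
import json
import mimetypes
import os
import urllib.request
import uuid
from typing import Optional, Tuple


def build_multipart(filename: str, content: bytes,
                    doc_id: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode the `file` and optional `doc_id` fields as multipart/form-data."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    if doc_id is not None:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="doc_id"'
            f"\r\n\r\n{doc_id}\r\n".encode()
        )
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\nContent-Type: {mime}\r\n\r\n'
        ).encode() + content + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def extract_invoice(path: str, doc_id: Optional[str] = None,
                    base_url: str = "http://localhost:7860") -> dict:
    # Read the image, POST it to /extract, and decode the JSON response.
    with open(path, "rb") as fh:
        body, content_type = build_multipart(os.path.basename(path), fh.read(), doc_id)
    req = urllib.request.Request(
        f"{base_url}/extract", data=body,
        headers={"Content-Type": content_type}, method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the server running, extract_invoice("invoice_001.png", doc_id="invoice_001") mirrors the cURL call.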

Response:

{
  "doc_id": "invoice_001",
  "fields": {
    "dealer_name": "ABC Tractors Pvt Ltd",
    "model_name": "Mahindra 575 DI",
    "horse_power": 50,
    "asset_cost": 525000,
    "signature": {"present": true, "bbox": [100, 200, 300, 250]},
    "stamp": {"present": true, "bbox": [400, 500, 500, 550]}
  },
  "confidence": 0.89,
  "processing_time_sec": 3.8,
  "cost_estimate_usd": 0.000528,
  "warnings": null
}
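
For downstream code it can help to turn this JSON into typed objects. The dataclasses below are a sketch inferred from the sample response only: the class names are hypothetical, the optional types are assumptions, and the layout of bbox and the non-null shape of warnings are not documented here, so both are passed through unchanged.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Detection:
    present: bool
    bbox: Optional[List[int]]  # kept exactly as returned by the API


@dataclass
class ExtractionResult:
    doc_id: Optional[str]
    dealer_name: Optional[str]
    model_name: Optional[str]
    horse_power: Optional[float]
    asset_cost: Optional[float]
    signature: Detection
    stamp: Detection
    confidence: float
    processing_time_sec: float
    cost_estimate_usd: float
    warnings: Any = None  # null in the sample response


def parse_result(payload: dict) -> ExtractionResult:
    fields = payload["fields"]

    def detection(key: str) -> Detection:
        raw = fields.get(key) or {}
        return Detection(present=bool(raw.get("present", False)), bbox=raw.get("bbox"))

    return ExtractionResult(
        doc_id=payload.get("doc_id"),
        dealer_name=fields.get("dealer_name"),
        model_name=fields.get("model_name"),
        horse_power=fields.get("horse_power"),
        asset_cost=fields.get("asset_cost"),
        signature=detection("signature"),
        stamp=detection("stamp"),
        confidence=payload["confidence"],
        processing_time_sec=payload["processing_time_sec"],
        cost_estimate_usd=payload["cost_estimate_usd"],
        warnings=payload.get("warnings"),
    )


sample = {
    "doc_id": "invoice_001",
    "fields": {
        "dealer_name": "ABC Tractors Pvt Ltd",
        "model_name": "Mahindra 575 DI",
        "horse_power": 50,
        "asset_cost": 525000,
        "signature": {"present": True, "bbox": [100, 200, 300, 250]},
        "stamp": {"present": True, "bbox": [400, 500, 500, 550]},
    },
    "confidence": 0.89,
    "processing_time_sec": 3.8,
    "cost_estimate_usd": 0.000528,
    "warnings": None,
}
result = parse_result(sample)
```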

3. Extract Multiple Invoices (Batch)

POST /extract_batch

Parameters:

files (required): Array of image files

Output Format

Results are saved to sample_output/result.json:

{
  "doc_id": "invoice_001",
  "fields": {
    "dealer_name": "ABC Tractors Pvt Ltd",
    "model_name": "Mahindra 575 DI",
    "horse_power": 50,
    "asset_cost": 525000,
    "signature": {"present": true, "bbox": [100, 200, 300, 250]},
    "stamp": {"present": true, "bbox": [400, 500, 500, 550]}
  },
  "confidence": 0.89,
  "processing_time_sec": 3.8,
  "cost_estimate_usd": 0.000528
}

Confidence range: 0.0 to 1.0 (higher is better)
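
One way to use the confidence value is to route low-scoring results to manual review. The sketch below assumes a cut-off of 0.8, which is purely illustrative and not a value taken from this project; needs_review is a hypothetical helper name.

```python
def needs_review(result: dict, threshold: float = 0.8) -> bool:
    """Return True when a result's confidence falls below the (assumed) threshold."""
    confidence = float(result["confidence"])
    # The documented range is 0.0 to 1.0; anything else signals a malformed result.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return confidence < threshold


flagged = needs_review({"confidence": 0.89})
```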

Cost Calculation

Formula:

cost_usd = (0.5 * processing_time_sec) / 3600

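Assumes $0.50 per GPU hour (the 0.5 coefficient in the formula; the sample response's 3.8 s gives 0.5 * 3.8 / 3600 ≈ $0.000528)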

Typical costs:

Per invoice: ~$0.001 (0.5 * 8 s / 3600 ≈ $0.0011 at ~8 seconds per invoice)
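
The formula translates directly into code. In this sketch the function and constant names are illustrative; the 0.5 coefficient is taken from the formula above and is read as a price in USD per GPU hour.

```python
GPU_RATE_USD_PER_HOUR = 0.5  # coefficient from the formula above


def estimate_cost_usd(processing_time_sec: float,
                      rate_per_hour: float = GPU_RATE_USD_PER_HOUR) -> float:
    """Convert GPU seconds into an estimated cost in USD."""
    if processing_time_sec < 0:
        raise ValueError("processing_time_sec must be non-negative")
    # 3600 seconds per hour turns the hourly rate into a per-second rate.
    return rate_per_hour * processing_time_sec / 3600


# The 3.8 s from the sample response reproduces its cost_estimate_usd of 0.000528.
cost = round(estimate_cost_usd(3.8), 6)
```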

Models

YOLO: Signature/stamp detection (best.pt)
Qwen2.5-VL-7B: Text extraction (4-bit quantized)

GPU Requirements

Minimum: 10 GB VRAM

Project Structure

INVOICE_INFO_EXTRACTOR/
├── app.py                 # FastAPI server (main entry point)
├── model_manager.py       # Model loading and caching
├── inference.py           # Processing pipeline and validation
├── config.py              # Configuration settings
├── requirements.txt       
├── README.md             
├── executable.py          # Legacy CLI (deprecated)
├── utils/
│   └── models/
│       └── best.pt        # YOLO model (stored locally)
└── sample_output/
    └── result.json        # Sample output

Performance

Processing time: ~8 seconds per invoice
Cost per invoice: ~$0.001 (GPU time, from the cost formula at ~8 seconds)
GPU Memory: 8GB minimum