---
title: Tractor Invoice Information Extractor
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# Invoice Information Extractor API
Extract structured information from Indian tractor invoices through an AI-powered REST API.
## What It Does
Combines **YOLO** (signature/stamp detection) + **Qwen2.5-VL** (text extraction) to extract:
- Dealer name
- Model name
- Horse power
- Asset cost
- Signature presence & location
- Stamp presence & location
## Architecture
### Production (Hugging Face Deployment)
- **FastAPI server** with REST endpoints
- **Models loaded on startup** and cached in memory
- **YOLO model** stored locally in `utils/models/best.pt`
- **Qwen2.5-VL** downloaded from Hugging Face on first run (not stored locally)
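The load-once, cache-in-memory behaviour described above could be sketched roughly as follows. This is an illustration only: `ModelCache`, `get`, `loaded`, and `fake_yolo_loader` are hypothetical names, and the loader is a stand-in for the real YOLO and Qwen2.5-VL loading calls. It does not reflect the actual contents of `model_manager.py`.

```python
import threading
from typing import Any, Callable, Dict


class ModelCache:
    """Hypothetical load-once cache (not the project's model_manager.py)."""

    def __init__(self) -> None:
        self._models: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, name: str, loader: Callable[[], Any]) -> Any:
        """Return the cached model, calling `loader` only on first use."""
        with self._lock:
            if name not in self._models:
                self._models[name] = loader()
            return self._models[name]

    def loaded(self) -> bool:
        """True once at least one model has been loaded."""
        return bool(self._models)


load_calls = []


def fake_yolo_loader() -> dict:
    # Stand-in for real weight loading; records each call so reuse is visible.
    load_calls.append("yolo")
    return {"weights": "utils/models/best.pt"}


cache = ModelCache()
first = cache.get("yolo", fake_yolo_loader)
second = cache.get("yolo", fake_yolo_loader)  # served from memory, no reload
```

The lock keeps two concurrent requests from loading the same model twice, which matters when a load takes several seconds and several gigabytes of VRAM.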
### Key Components
- `app.py` - FastAPI server with endpoints
- `model_manager.py` - Handles model loading and caching
- `inference.py` - Processing pipeline and validation
- `config.py` - Configuration settings
- `executable.py` - Legacy CLI interface (deprecated)
## Installation
```bash
pip install -r requirements.txt
```
**Requirements:** Python 3.10+, CUDA GPU (8GB+ VRAM)
## Running the Server
### Local Development
```bash
python app.py
```
Server runs on `http://localhost:7860`
### Production (Hugging Face Spaces)
```bash
uvicorn app:app --host 0.0.0.0 --port 7860
```
## API Endpoints
### 1. Health Check
```bash
GET /health
```
**Response:**
```json
{
"status": "healthy",
"models_loaded": true
}
```
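Because Qwen2.5-VL is downloaded on first run, a client may want to wait for the server before sending invoices. The sketch below polls `/health` using only the standard library. `fetch_health` and `wait_until_ready` are hypothetical helper names, and it assumes the endpoint reports `"models_loaded": false` (or is unreachable) while models are still loading, which the response example does not confirm.

```python
import json
import time
import urllib.request
from typing import Callable, Dict


def fetch_health(base_url: str = "http://localhost:7860") -> Dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_ready(fetch: Callable[[], Dict] = fetch_health,
                     timeout_sec: float = 300.0,
                     interval_sec: float = 2.0) -> bool:
    """Poll until the server reports healthy with models loaded, or time out."""
    deadline = time.monotonic() + timeout_sec
    while True:
        try:
            payload = fetch()
            if payload.get("status") == "healthy" and payload.get("models_loaded") is True:
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: retry.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_sec)
```

Passing the fetch function as a parameter keeps the polling logic testable without a running server.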
### 2. Extract Single Invoice
```bash
POST /extract
```
**Parameters:**
- `file` (required): Image file (JPG, PNG, JPEG)
- `doc_id` (optional): Document identifier
**Example (cURL):**
```bash
curl -X POST "http://localhost:7860/extract" \
-F "file=@invoice_001.png" \
-F "doc_id=invoice_001"
```
**Response:**
```json
{
"doc_id": "invoice_001",
"fields": {
"dealer_name": "ABC Tractors Pvt Ltd",
"model_name": "Mahindra 575 DI",
"horse_power": 50,
"asset_cost": 525000,
"signature": {"present": true, "bbox": [100, 200, 300, 250]},
"stamp": {"present": true, "bbox": [400, 500, 500, 550]}
},
"confidence": 0.89,
"processing_time_sec": 3.8,
"cost_estimate_usd": 0.000528,
"warnings": null
}
```
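On the client side, the response above can be mapped onto typed objects. The following is a minimal sketch: `Detection`, `ExtractionResult`, and `parse_result` are hypothetical names, and the field types are inferred from the single example response, so real responses may contain nulls or extra keys that this does not handle.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    present: bool
    bbox: Optional[List[int]]


@dataclass
class ExtractionResult:
    doc_id: str
    dealer_name: str
    model_name: str
    horse_power: int
    asset_cost: int
    signature: Detection
    stamp: Detection
    confidence: float
    processing_time_sec: float
    cost_estimate_usd: float
    warnings: Optional[list]


def parse_result(raw: str) -> ExtractionResult:
    """Decode one /extract response body (types assumed from the example)."""
    data = json.loads(raw)
    fields = data["fields"]
    sig, stamp = fields["signature"], fields["stamp"]
    return ExtractionResult(
        doc_id=data["doc_id"],
        dealer_name=fields["dealer_name"],
        model_name=fields["model_name"],
        horse_power=fields["horse_power"],
        asset_cost=fields["asset_cost"],
        signature=Detection(sig["present"], sig.get("bbox")),
        stamp=Detection(stamp["present"], stamp.get("bbox")),
        confidence=data["confidence"],
        processing_time_sec=data["processing_time_sec"],
        cost_estimate_usd=data["cost_estimate_usd"],
        warnings=data.get("warnings"),
    )


sample = """
{"doc_id": "invoice_001",
 "fields": {"dealer_name": "ABC Tractors Pvt Ltd",
            "model_name": "Mahindra 575 DI",
            "horse_power": 50,
            "asset_cost": 525000,
            "signature": {"present": true, "bbox": [100, 200, 300, 250]},
            "stamp": {"present": true, "bbox": [400, 500, 500, 550]}},
 "confidence": 0.89,
 "processing_time_sec": 3.8,
 "cost_estimate_usd": 0.000528,
 "warnings": null}
"""
result = parse_result(sample)
```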
### 3. Extract Multiple Invoices (Batch)
```bash
POST /extract_batch
```
**Parameters:**
- `files` (required): Array of image files
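A batch request might be assembled as below, using only the standard library. This is a sketch under assumptions: `encode_batch` and `post_batch` are hypothetical helpers, the array is sent by repeating the `files` form field once per image (the usual multipart convention), and the response is returned as decoded JSON without assuming its shape, since the batch response format is not documented here.

```python
import json
import mimetypes
import urllib.request
import uuid
from typing import Any, Dict, Tuple


def encode_batch(files: Dict[str, bytes], field: str = "files") -> Tuple[bytes, str]:
    """Encode in-memory images as one multipart/form-data body.

    Returns (body, content_type). Every part reuses the same field name.
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for filename, content in files.items():
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        chunks.append(header.encode("utf-8"))
        chunks.append(content)
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def post_batch(files: Dict[str, bytes],
               base_url: str = "http://localhost:7860") -> Any:
    """POST the images to /extract_batch and return the decoded JSON."""
    body, content_type = encode_batch(files)
    req = urllib.request.Request(
        f"{base_url}/extract_batch",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a running server, usage would look like `post_batch({"invoice_001.png": open("invoice_001.png", "rb").read()})`.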
## Output Format
Results are saved to `sample_output/result.json`:
```json
{
"doc_id": "invoice_001",
"fields": {
"dealer_name": "ABC Tractors Pvt Ltd",
"model_name": "Mahindra 575 DI",
"horse_power": 50,
"asset_cost": 525000,
"signature": {"present": true, "bbox": [100, 200, 300, 250]},
"stamp": {"present": true, "bbox": [400, 500, 500, 550]}
},
"confidence": 0.89,
"processing_time_sec": 3.8,
"cost_estimate_usd": 0.000528
}
```
The `confidence` value ranges from 0.0 to 1.0 (higher is better).
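A consumer of this output may want to sanity-check it before use. The sketch below verifies the documented confidence range and the bounding boxes. `validate_result` is a hypothetical helper, not the validation in `inference.py`, and it assumes `bbox` is `[x1, y1, x2, y2]` with the top-left corner first, which matches the sample values but is not stated in this README.

```python
from typing import Any, Dict, List


def validate_result(result: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the result looks sane."""
    problems = []

    confidence = result.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be a number in [0.0, 1.0]")

    fields = result.get("fields", {})
    for key in ("signature", "stamp"):
        detection = fields.get(key) or {}
        if not detection.get("present"):
            continue  # nothing to check when the element was not detected
        bbox = detection.get("bbox")
        if not (isinstance(bbox, list) and len(bbox) == 4):
            problems.append(f"{key}.bbox must contain four coordinates")
        elif not (bbox[0] < bbox[2] and bbox[1] < bbox[3]):
            # Assumed layout: [x1, y1, x2, y2]
            problems.append(f"{key}.bbox has non-positive width or height")
    return problems


good = {
    "confidence": 0.89,
    "fields": {
        "signature": {"present": True, "bbox": [100, 200, 300, 250]},
        "stamp": {"present": True, "bbox": [400, 500, 500, 550]},
    },
}
issues = validate_result(good)  # no problems expected for the sample output
```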
## Cost Calculation
**Formula:**
```
cost_usd = (0.5 * processing_time_sec) / 3600
```
Assumes **$0.50 per GPU hour** (the `0.5` coefficient in the formula; for example, 3.8 seconds gives $0.000528)
**Typical costs:**
- Per invoice: ~$0.001 (about 8 seconds of GPU time)
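The formula can be written as a small function to reproduce the `cost_estimate_usd` field. `estimate_cost_usd` is a hypothetical name; the default rate of 0.5 is the coefficient from the formula above, read as US dollars per GPU hour.

```python
def estimate_cost_usd(processing_time_sec: float,
                      gpu_rate_usd_per_hour: float = 0.5) -> float:
    """GPU cost of one request: hourly rate scaled by seconds used."""
    return gpu_rate_usd_per_hour * processing_time_sec / 3600


# The example response reports 3.8 s of processing time.
sample_cost = round(estimate_cost_usd(3.8), 6)   # 0.000528

# An 8-second invoice at the same rate.
typical_cost = round(estimate_cost_usd(8.0), 4)  # 0.0011
```

Exposing the hourly rate as a parameter makes it straightforward to re-estimate costs for different GPU pricing.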
## Models
- **YOLO:** Signature/stamp detection (`best.pt`)
- **Qwen2.5-VL-7B:** Text extraction (4-bit quantized)
## GPU Requirements
- **Minimum:** 10 GB VRAM
## Project Structure
```
INVOICE_INFO_EXTRACTOR/
├── app.py                 # FastAPI server (main entry point)
├── model_manager.py       # Model loading and caching
├── inference.py           # Processing pipeline and validation
├── config.py              # Configuration settings
├── requirements.txt
├── README.md
├── executable.py          # Legacy CLI (deprecated)
├── utils/
│   └── models/
│       └── best.pt        # YOLO model (stored locally)
└── sample_output/
    └── result.json        # Sample output
```
## Performance
- **Processing time:** ~8 seconds per invoice
- **Cost per invoice:** ~$0.001 (GPU time, per the cost formula at ~8 seconds)
- **GPU Memory:** 8GB minimum