name: invoice_processing_pipeline
version: "1.0.0"
description: >
  A self-improving 5-agent adversarial RL environment for invoice fraud detection.
  A cross-episode Regulator monitors the Auditor's blind spots and biases the Generator
  to produce harder fraud, closing a self-improvement loop without human intervention.
  The 10 tasks range from easy extraction to 20-step long-horizon investigations and
  adaptive personalized curricula.

author: "Pritam Satpathy & Gnana Nawin T"
license: "MIT"

tags:
  - openenv
  - invoice
  - fraud-detection
  - multi-agent
  - self-improvement
  - grpo
  - finance
  - curriculum

environment:
  module: server.app
  class: InvoiceEnvironment
  action: models.InvoiceAction
  observation: models.InvoiceObservation

tasks:
  - id: easy
    name: "Single Invoice Extraction"
    description: "Extract structured fields (vendor, date, currency, total, line items) from a single invoice."
    difficulty: easy

  - id: medium
    name: "Batch Invoice Cleaning"
    description: "Clean and normalise a batch of messy invoices: fix dates, vendor typos, currency codes, and amounts."
    difficulty: medium

  - id: hard
    name: "Invoice-PO Reconciliation"
    description: "Extract, clean, and reconcile invoices against purchase orders. Flag overcharges, extra items, and missing items."
    difficulty: hard

  - id: expert
    name: "Invoice Fraud Audit"
    description: "Detect fraudulent invoices using an approved vendor registry, a market price catalog, and invoice history. Classify the fraud type as phantom_vendor, price_gouging, duplicate_submission, or math_fraud."
    difficulty: expert

  - id: adversarial
    name: "Adversarial Invoice Extraction"
    description: "Extract fields from an invoice with OCR corruption (0→O, 1→l, 5→S), a misleading SUBTOTAL trap, fabricated TAX/ADJUSTMENT lines, and a multi-currency FX noise line. The TOTAL line is always correct."
    difficulty: hard

  - id: negotiate
    name: "Negotiated Invoice Clarification"
    description: "Ask clarification questions (submit {question: str}) about an ambiguous invoice, then submit the full structured extraction. A bonus is awarded for solving correctly with ≤2 questions."
    difficulty: medium

  - id: supply_chain
    name: "Supply Chain Anomaly Detection"
    description: "Identify quantity shortfalls, price spikes, unauthorized substitutions, and phantom deliveries in a set of supply chain delivery records."
    difficulty: expert

  - id: long_horizon
    name: "Long-Horizon Financial Investigation"
    description: "A 20-step, 4-phase investigation with sparse rewards. Phase 1: extract 3 invoices. Phase 2: reconcile against POs (unlocked). Phase 3: fraud audit (registry unlocked). Phase 4: risk forecast. Each phase must be completed to unlock the next phase's reference data."
    difficulty: expert

  - id: personalized
    name: "Personalized Adaptive Task"
    description: "Tracks per-field accuracy (vendor, date, math, completeness) across steps and generates the next invoice to target the agent's weakest field. The reward is weighted toward the historically weakest category."
    difficulty: adaptive

  - id: curriculum
    name: "Auto-Progressive Curriculum"
    description: "Automatically progresses the agent through easy→medium→hard→expert based on score. A score ≥0.80 advances the agent to the next stage; a score <0.40 holds it back. Up to 20 steps across all stages."
    difficulty: adaptive

endpoints:
  reset: /reset
  step: /step
  state: /state
  health: /health
  grader: /grader
  tasks: /tasks
  metrics: /metrics
  websocket: /ws

multi_agent_endpoints:
  multi_reset: /multi/reset
  multi_extract: /multi/extract
  multi_audit: /multi/audit
  multi_approve: /multi/approve
  multi_state: /multi/state/{episode_id}

regulator_endpoints:
  report: /regulator/report
  forecast: /regulator/forecast
  calibration: /regulator/calibration
  predict: /regulator/predict
  generator_score: /generator/score

ui:
  gradio: /web
  swagger: /docs