pinned: false
license: mit
---
# Financial RAG with Entity-Aware Routing & Vision-Native Tables

**Portfolio project – Master 1 Artificial Intelligence**

Design of a **constrained Financial RAG pipeline** focused on reducing hallucinations and accidental cross-document errors **through architecture, not prompt engineering**.

---
## Scope

This is a **controlled demo system** operating on a small “golden dataset” (6 pages from the 2023 annual reports of **Apple** and **Microsoft**).

The goal is to demonstrate **system design choices and failure modes**, not dataset scale or production readiness.

---
## Why this project exists

Most RAG failures in finance are **structural**, not model-related:

- Documents from different companies are mixed due to semantic similarity
- Tables are flattened by OCR, breaking numeric alignment
- Models answer confidently even when data is missing or ambiguous

This project explores how **explicit constraints and routing** can eliminate entire classes of errors — and where those constraints **still break**.

---

## Design Philosophy

### Constraints > Prompts

Instead of asking the model to “be careful”, the system:

- restricts what can be retrieved
- restricts what can be answered
- rejects unsupported or ambiguous queries
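A minimal sketch of what these hard constraints could look like in code. Names and logic here are illustrative, not the demo's actual implementation:

```python
# Illustrative sketch: retrieval and answering are gated by an allow-list,
# and queries outside it are rejected outright instead of answered.
SUPPORTED_COMPANIES = {"apple", "microsoft"}  # the demo's golden dataset

def route_or_reject(query: str) -> dict:
    """Return the corpora a query may touch, or an explicit refusal."""
    mentioned = {c for c in SUPPORTED_COMPANIES if c in query.lower()}
    if not mentioned:
        # Fail loudly instead of guessing: out of scope or ambiguous.
        return {"status": "rejected", "reason": "no supported entity detected"}
    return {"status": "ok", "corpora": sorted(mentioned)}

print(route_or_reject("What was Apple's total revenue in 2023?"))
print(route_or_reject("What was Google's revenue in 2023?"))
```

The point is that the refusal path is ordinary control flow, not a prompt instruction the model can ignore.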
---

## Key Features

- **Entity-aware routing**
  Company entities are detected before retrieval.

- **Vision-native tables**
  Financial tables are processed as images to preserve structure (no OCR flattening).

- **Explicit refusal**
  Out-of-scope or ambiguous queries fail loudly instead of hallucinating.

- **Transparent pipeline**
  Retrieval, reranking, and source pages are fully visible in the UI.
---

## Architecture
```mermaid
flowchart LR
    Q[User Query] --> R{"Entity Router (GLiNER • CPU)"}

    R -->|Apple| A[Apple corpus only]
    R -->|Microsoft| M[Microsoft corpus only]
    R -->|Other / none| X["Reject (out of scope / ambiguous)"]

    A --> D["Dense retrieval (gte-Qwen2-7B)"]
    M --> D
    D --> K["Reranking (bge-reranker-v2-m3)"]

    K --> V["Vision reasoning (Qwen2-VL)"]
    V --> Y[Grounded answer]
```
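The flow above can be sketched as plain function composition. The stage bodies here are stand-in stubs for the real models (GLiNER, gte-Qwen2-7B, bge-reranker-v2-m3, Qwen2-VL); only the control flow mirrors the diagram:

```python
# Toy corpus: page IDs per company, standing in for the golden dataset.
CORPORA = {"apple": ["apple_p1", "apple_p2"], "microsoft": ["msft_p1"]}

def route(query):
    # Stub: single-entity path only; the real demo may allow explicit
    # multi-entity queries (see "Observed System Behavior").
    hits = [c for c in CORPORA if c in query.lower()]
    return hits[0] if len(hits) == 1 else None  # None -> reject

def retrieve(corpus):   # dense-retrieval stub
    return CORPORA[corpus]

def rerank(pages):      # reranker stub: keep top-1
    return pages[:1]

def answer(query):
    corpus = route(query)
    if corpus is None:
        return "REJECTED: out of scope or ambiguous"
    top = rerank(retrieve(corpus))
    # In the real pipeline, the vision model reads the page image here.
    return f"Grounded answer from {top[0]}"

print(answer("What was Apple's revenue?"))
print(answer("What was Google's revenue?"))
```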
## Observed System Behavior (Important)

This section reflects **actual behavior observed in the demo**, not idealized guarantees.

- Single-entity queries are routed to the corresponding company corpus only
- Explicit multi-entity queries may trigger cross-company reasoning
- Source-constrained prompts are not strictly enforced
- Implicit references (e.g. “Cupertino-based company”) are rejected

The system prevents **accidental corpus mixing**, not intentional multi-entity analysis.
---

## What the system guarantees (by construction)

- No hallucination outside the provided documents
- No answers for unsupported entities
- No numeric invention (values must appear in source pages)
- Clear refusal when data is missing or ambiguous

---

## Technical Stack

- **Entity Routing:** GLiNER (CPU)
- **Dense Retrieval:** gte-Qwen2-7B
- **Reranking:** BAAI/bge-reranker-v2-m3
- **Vision Reasoning:** Qwen2-VL-2B
- **UI:** Gradio (Hugging Face Spaces, ZeroGPU)

**Design choice:**
Embeddings are recomputed on-the-fly to keep the system **stateless and fully inspectable**.
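A stdlib-only sketch of the stateless idea: embeddings are computed per request and compared by cosine similarity, with no stored index. The hash-based "embedding" is a deterministic toy standing in for gte-Qwen2-7B:

```python
import hashlib
import math

def embed(text: str, dim: int = 16) -> list[float]:
    """Toy deterministic embedding; the real system uses gte-Qwen2-7B."""
    digest = hashlib.sha256(text.lower().encode()).digest()
    return [b / 255 for b in digest[:dim]]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query: str, pages: dict[str, str], k: int = 2) -> list[str]:
    # Recomputed on every call: no vector DB, higher latency,
    # but every score is reproducible from the inputs alone.
    q = embed(query)
    ranked = sorted(pages, key=lambda p: cosine(q, embed(pages[p])), reverse=True)
    return ranked[:k]

pages = {"apple_p3": "total revenue table", "apple_p5": "operating income table"}
print(retrieve("revenue", pages))
```

The trade-off is deliberate: recomputation costs latency but removes hidden state, which matters more for a demo meant to be inspected.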
---

## Demo

### ▶ Live Demo (Recommended)

Deployed on **Hugging Face Spaces** using ZeroGPU.
Link provided on CV.

### Local Execution

Requires a GPU with ~24 GB VRAM.

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
pip install -r requirements.txt
python app.py
```
## Known Limitations & Trade-offs

### 1. Entity detection ≠ entity role understanding

The router detects *which* entities are mentioned, not *how* they are used.

Example:

> “According to Apple documents, what is Microsoft’s revenue?”

This may succeed because the system prioritizes answering a factual question over enforcing adversarial source constraints.

**Production fix:** dependency parsing or semantic role labeling.
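To illustrate the gap, even a shallow pattern check (hypothetical, not part of the demo's router) can flag "according to X" source constraints before full dependency parsing or SRL is in place:

```python
import re

# Hypothetical shallow check: flag queries that name one company as the
# *source* ("according to X") while asking about a different company.
SOURCE_PATTERN = re.compile(r"according to (\w+)", re.IGNORECASE)
COMPANIES = {"apple", "microsoft"}

def conflicting_source(query: str) -> bool:
    match = SOURCE_PATTERN.search(query)
    if not match or match.group(1).lower() not in COMPANIES:
        return False
    source = match.group(1).lower()
    subjects = {c for c in COMPANIES if c in query.lower()} - {source}
    return bool(subjects)  # another company is asked about via a foreign source

print(conflicting_source("According to Apple documents, what is Microsoft's revenue?"))
print(conflicting_source("According to Apple documents, what is Apple's revenue?"))
```

A regex is obviously brittle; it only shows where role-aware parsing would slot into the pipeline.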
---

### 2. Explicit multi-entity reasoning is allowed

Queries that intentionally involve multiple companies may produce cross-company aggregation.

This is a **design choice**, not a bug.

---

### 3. Stateless retrieval

No vector database is used.

**Trade-off:**
- Higher latency
- Maximum transparency

---

## What this project demonstrates

- Ability to design constrained RAG pipelines
- Understanding of LLM failure modes
- Practical multimodal reasoning on structured financial data
- Clear separation between routing logic, retrieval, prompting, and UI
- Engineering honesty about system guarantees vs limitations

---

## Context

Built as part of a **Master 1 in Artificial Intelligence**.
Learning-focused portfolio project — **not** a production financial advisory system.
---

# Financial RAG Demo

This demo showcases a **constrained Financial RAG pipeline** designed to reduce hallucinations through **explicit routing and hard constraints**, not prompt tricks.

---

## What this demo does
- Routes queries based on detected company entities (Apple / Microsoft)
- Prevents accidental cross-company document mixing
- Processes financial tables as images to preserve structure
- Explicitly rejects unsupported or ambiguous queries

---

## How to test it

Try the following queries:

- `What was Apple’s total revenue in 2023?`
- `What is Microsoft’s operating income?`
- `Compare Apple and Microsoft revenues` → rejected or limited
- `What was Google’s revenue in 2023?` → rejected

The UI shows retrieved pages and scores to make the pipeline inspectable.
---

## Important limitations

- Explicit multi-company questions may trigger cross-entity reasoning
- Source-constrained prompts are not strictly enforced
- Dataset is intentionally small (demo-only)

For full technical details and design discussion, see the GitHub repository linked on the CV.