Commit 47f6bf1 (verified) · committed by ZedLow · 1 parent: 9bf28f1

Update README.md

  1. README.md +18 -165
README.md CHANGED
@@ -10,185 +10,38 @@ pinned: false
10
  license: mit
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
14
 
15
-
16
- # Financial RAG with Entity-Aware Routing & Vision-Native Tables
17
-
18
- **Portfolio project – Master 1 Artificial Intelligence**
19
-
20
- Design of a **constrained Financial RAG pipeline** focused on reducing hallucinations and accidental cross-document errors **through architecture, not prompt engineering**.
21
 
22
  ---
23
 
24
- ## ⚠️ Scope
25
-
26
- This is a **controlled demo system** operating on a small “golden dataset” (6 pages from the 2023 annual reports of **Apple** and **Microsoft**).
27
-
28
- The goal is to demonstrate **system design choices and failure modes**, not dataset scale or production readiness.
29
-
30
- ---
31
-
32
- ## Why this project exists
33
-
34
- Most RAG failures in finance are **structural**, not model-related:
35
-
36
- - Documents from different companies are mixed due to semantic similarity
37
- - Tables are flattened by OCR, breaking numeric alignment
38
- - Models answer confidently even when data is missing or ambiguous
39
-
40
- This project explores how **explicit constraints and routing** can eliminate entire classes of errors — and where those constraints **still break**.
41
-
42
- ---
43
-
44
- ## Design Philosophy
45
-
46
- ### Constraints > Prompts
47
-
48
- Instead of asking the model to “be careful”, the system:
49
-
50
- - restricts what can be retrieved
51
- - restricts what can be answered
52
- - rejects unsupported or ambiguous queries
53
 
54
- The objective is **inspectability and predictable failure**, not conversational flexibility.
55
 
56
  ---
57
 
58
- ## Key Properties
59
 
60
- - **Entity-aware routing (CPU)**
61
- Company entities are detected before retrieval.
62
 
63
- - **Corpus-level filtering**
64
- Single-entity queries are routed to a single company corpus.
 
 
65
 
66
- - **Vision-native table reasoning**
67
- Financial tables are processed as images to preserve structure (no OCR flattening).
68
-
69
- - **Explicit refusal**
70
- Out-of-scope or ambiguous queries fail loudly instead of hallucinating.
71
-
72
- - **Transparent pipeline**
73
- Retrieval, reranking, and source pages are fully visible in the UI.
74
 
75
  ---
76
 
77
- ## Architecture Overview
78
-
79
- ```mermaid
80
- flowchart LR
81
- Q[User Query] --> R{Entity Router (GLiNER • CPU)}
82
-
83
- R -->|Apple| A[Apple corpus only]
84
- R -->|Microsoft| M[Microsoft corpus only]
85
- R -->|Other / none| X[Reject (out of scope / ambiguous)]
86
-
87
- A --> D[Dense retrieval (gte-Qwen2-7B)]
88
- M --> D
89
 
90
- D --> K[Reranker (BGE-M3)]
91
- K --> V[Vision reasoning (Qwen2-VL)]
92
- V --> Y[Grounded answer]
93
-
94
- ## Observed System Behavior (Important)
95
-
96
- This section reflects **actual behavior observed in the demo**, not idealized guarantees.
97
-
98
- - Single-entity queries are routed to the corresponding company corpus only
99
- - Explicit multi-entity queries may trigger cross-company reasoning
100
  - Source-constrained prompts are not strictly enforced
101
- - Out-of-scope entities (Google, Tesla, etc.) are explicitly rejected
102
- - Implicit references (e.g. “Cupertino-based company”) are rejected
103
-
104
- The system prevents **accidental corpus mixing**, not intentional multi-entity analysis.
105
-
106
- ---
107
-
108
- ## What the system guarantees (by construction)
109
-
110
- - No hallucination outside the provided documents
111
- - No answers for unsupported entities
112
- - No numeric invention (values must appear in source pages)
113
- - Clear refusal when data is missing or ambiguous
114
-
115
- ---
116
-
117
- ## Technical Stack
118
-
119
- - **Entity Routing:** GLiNER (CPU)
120
- - **Dense Retrieval:** gte-Qwen2-7B
121
- - **Reranking:** BAAI/bge-reranker-v2-m3
122
- - **Vision Reasoning:** Qwen2-VL-2B
123
- - **UI:** Gradio (Hugging Face Spaces, ZeroGPU)
124
-
125
- **Design choice:**
126
- Embeddings are recomputed on-the-fly to keep the system **stateless and fully inspectable**.
127
-
128
- ---
129
-
130
- ## Demo
131
-
132
- ### ▶ Live Demo (Recommended)
133
-
134
- Deployed on **Hugging Face Spaces** using ZeroGPU.
135
- Link provided on CV.
136
-
137
- ### Local Execution
138
-
139
- Requires a GPU with ~24 GB VRAM.
140
-
141
- ```bash
142
- git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
143
- pip install -r requirements.txt
144
- python app.py
145
-
146
- ## Known Limitations & Trade-offs
147
-
148
- ### 1. Entity detection ≠ entity role understanding
149
-
150
- The router detects *which* entities are mentioned, not *how* they are used.
151
-
152
- Example:
153
-
154
- > “According to Apple documents, what is Microsoft’s revenue?”
155
-
156
- This may succeed because the system prioritizes answering a factual question over enforcing adversarial source constraints.
157
-
158
- **Production fix:** dependency parsing or semantic role labeling.
159
-
160
- ---
161
-
162
- ### 2. Explicit multi-entity reasoning is allowed
163
-
164
- Queries that intentionally involve multiple companies may produce cross-company aggregation.
165
-
166
- This is a **design choice**, not a bug.
167
-
168
- ---
169
-
170
- ### 3. Stateless retrieval
171
-
172
- No vector database is used.
173
-
174
- **Trade-off:**
175
- - Higher latency
176
- - Maximum transparency
177
-
178
- ---
179
-
180
- ## What this project demonstrates
181
-
182
- - Ability to design constrained RAG pipelines
183
- - Understanding of LLM failure modes
184
- - Practical multimodal reasoning on structured financial data
185
- - Clear separation between routing logic, retrieval, prompting, and UI
186
- - Engineering honesty about system guarantees vs limitations
187
-
188
- ---
189
-
190
- ## Context
191
-
192
- Built as part of a **Master 1 in Artificial Intelligence**.
193
- Learning-focused portfolio project — **not** a production financial advisory system.
194
 
 
 
10
  license: mit
11
  ---
12
 
13
+ # Financial RAG Demo
14
 
15
+ This demo showcases a **constrained Financial RAG pipeline** designed to reduce hallucinations through **explicit routing and hard constraints**, not prompt tricks.
16
 
17
  ---
18
 
19
+ ## What this demo does
20
 
21
+ - Routes queries based on detected company entities (Apple / Microsoft)
22
+ - Prevents accidental cross-company document mixing
23
+ - Processes financial tables as images to preserve structure
24
+ - Explicitly rejects unsupported or ambiguous queries
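Editor's sketch (not part of the commit): the routing and rejection behavior listed above could look roughly like the following. This is a minimal illustration under stated assumptions, not the Space's actual code. The demo detects entities with GLiNER; plain keyword matching stands in for it here so the example runs without dependencies. The names `SUPPORTED`, `Route`, and `route_query` are hypothetical, and returning both corpora for a two-company query is an assumption based on the limitation that explicit multi-company questions may trigger cross-entity reasoning.

```python
import re
from dataclasses import dataclass

# Hypothetical alias table. Keyword matching is a stand-in for the GLiNER
# entity detector used in the demo.
SUPPORTED = {
    "apple": ("apple", "aapl"),
    "microsoft": ("microsoft", "msft"),
}


@dataclass(frozen=True)
class Route:
    corpora: tuple  # company corpora the retriever is allowed to search
    rejected: bool  # True when no supported company was detected


def route_query(query: str) -> Route:
    """Pick the company corpora for a query, or reject it."""
    # Lowercase and split into alphabetic tokens so "Apple’s" yields "apple".
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    hits = tuple(
        sorted(name for name, aliases in SUPPORTED.items() if tokens & set(aliases))
    )
    # No supported company mentioned: refuse instead of searching everything.
    return Route(corpora=hits, rejected=not hits)
```

With this sketch, a query that names only Apple is restricted to the Apple corpus, while a query about Google or about a "Cupertino-based company" matches no supported alias and is rejected before retrieval.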
25
 
26
  ---
27
 
28
+ ## How to test it
29
 
30
+ Try the following queries:
 
31
 
32
+ - `What was Apple’s total revenue in 2023?`
33
+ - `What is Microsoft’s operating income?`
34
+ - `Compare Apple and Microsoft revenues` → rejected or limited
35
+ - `What was Google’s revenue in 2023?` → rejected
36
 
37
+ The UI shows retrieved pages and scores to make the pipeline inspectable.
38
 
39
  ---
40
 
41
+ ## Important limitations
42
 
43
+ - Explicit multi-company questions may trigger cross-entity reasoning
44
  - Source-constrained prompts are not strictly enforced
45
+ - Dataset is intentionally small (demo-only)
46
 
47
+ For full technical details and design discussion, see the GitHub repository linked on the CV.