ZedLow committed on
Commit
80ff8d2
·
verified ·
1 Parent(s): f464881

Update README.md

Files changed (1)
  1. README.md +176 -10
README.md CHANGED
@@ -1,13 +1,179 @@
  ---
- title: Finance RAG Analyst
- emoji: 📉
- colorFrom: purple
- colorTo: green
- sdk: gradio
- sdk_version: 6.5.1
- app_file: app.py
- pinned: false
- license: mit
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🛡️ Financial RAG with Entity-Aware Routing & Vision-Native Tables
+
+ **Portfolio project – Master 1 Artificial Intelligence**
+
+ A **constrained financial RAG pipeline** designed to reduce hallucinations and accidental cross-document errors **through architecture, not prompt engineering**.
+
+ ---
+
+ ## ⚠️ Scope
+
+ This is a **controlled demo system** operating on a small “golden dataset” (6 pages from the 2023 annual reports of **Apple** and **Microsoft**).
+
+ The goal is to demonstrate **system design choices and failure modes**, not dataset scale or production readiness.
+
+ ---
+
+ ## Why this project exists
+
+ Most RAG failures in finance are **structural**, not model-related:
+
+ - Documents from different companies are mixed due to semantic similarity
+ - Tables are flattened by OCR, breaking numeric alignment
+ - Models answer confidently even when data is missing or ambiguous
+
+ This project explores how **explicit constraints and routing** can eliminate entire classes of errors, and where those constraints **still break**.
+
+ ---
+
+ ## Design Philosophy
+
+ ### Constraints > Prompts
+
+ Instead of asking the model to “be careful”, the system:
+
+ - restricts what can be retrieved
+ - restricts what can be answered
+ - rejects unsupported or ambiguous queries
+
+ The objective is **inspectability and predictable failure**, not conversational flexibility.
+
+ ---
+
+ ## Key Properties
+
+ - 🧠 **Entity-aware routing (CPU)**
+   Company entities are detected before retrieval.
+
+ - 🔒 **Corpus-level filtering**
+   Single-entity queries are routed to a single company corpus.
+
+ - 👁️ **Vision-native table reasoning**
+   Financial tables are processed as images to preserve structure (no OCR flattening).
+
+ - 🚫 **Explicit refusal**
+   Out-of-scope or ambiguous queries fail loudly instead of hallucinating.
+
+ - 🔍 **Transparent pipeline**
+   Retrieval, reranking, and source pages are fully visible in the UI.
+
  ---
+
+ ## Architecture Overview
+
+ ```mermaid
+ flowchart LR
+ Q[User Query] --> R{"Entity Router (GLiNER • CPU)"}
+
+ R -->|Apple| A[Apple corpus only]
+ R -->|Microsoft| M[Microsoft corpus only]
+ R -->|Other / none| X["Reject (out of scope / ambiguous)"]
+
+ A --> D["Dense retrieval (gte-Qwen2-7B)"]
+ M --> D
+
+ D --> K["Reranker (bge-reranker-v2-m3)"]
+ K --> V["Vision reasoning (Qwen2-VL-2B)"]
+ V --> Y[Grounded answer]
+ ```
+
+ ## Observed System Behavior (Important)
+
+ This section reflects **actual behavior observed in the demo**, not idealized guarantees.
+
+ - Single-entity queries are routed to the corresponding company corpus only
+ - Explicit multi-entity queries may trigger cross-company reasoning
+ - Source-constrained prompts are not strictly enforced
+ - Out-of-scope entities (Google, Tesla, etc.) are explicitly rejected
+ - Implicit references (e.g. “Cupertino-based company”) are rejected
+
+ The system prevents **accidental corpus mixing**, not intentional multi-entity analysis.
+
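The routing behavior listed above can be sketched as a small decision function. This is a minimal illustration, not the project's actual router: it assumes the entity detector (GLiNER in the demo) has already returned a list of company names, and the corpus names, the `RouteDecision` type, and the rule that any unsupported entity causes a rejection are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical corpus names; the demo covers only these two companies.
SUPPORTED = {"apple": "apple_corpus", "microsoft": "microsoft_corpus"}


@dataclass(frozen=True)
class RouteDecision:
    accepted: bool
    corpora: tuple[str, ...]
    reason: str


def route(detected_entities: list[str]) -> RouteDecision:
    """Map the company names found in a query to the corpora it may search."""
    names = {name.strip().lower() for name in detected_entities}
    if not names:
        # Implicit references ("Cupertino-based company") yield no explicit entity.
        return RouteDecision(False, (), "no explicit company entity")
    unsupported = sorted(names - set(SUPPORTED))
    if unsupported:
        # Assumption: one unsupported entity is enough to reject the query.
        return RouteDecision(False, (), f"out of scope: {', '.join(unsupported)}")
    corpora = tuple(SUPPORTED[name] for name in sorted(names))
    return RouteDecision(True, corpora, "routed")


for entities in (["Apple"], ["Apple", "Microsoft"], ["Tesla"], []):
    print(entities, "->", route(entities))
```

A single supported entity maps to one corpus, two supported entities map to both (the intentional multi-entity case), and everything else is refused before retrieval runs.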
  ---
 
+ ## What the system guarantees (by construction)
+
+ - No hallucination outside the provided documents
+ - No answers for unsupported entities
+ - No numeric invention (values must appear in source pages)
+ - Clear refusal when data is missing or ambiguous
+
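One way to picture the "no numeric invention" rule is a post-hoc check that every number in an answer also appears in the retrieved pages. The sketch below is an assumption about how such a check could look, not the project's implementation: it presumes a plain-text transcription of each source page is available, the function names are hypothetical, and the figures in the usage lines are placeholders rather than real financial data.

```python
import re

# Matches integers and decimals, with optional thousands separators.
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> set[str]:
    """Return the numeric tokens in text, with thousands separators removed."""
    return {token.replace(",", "") for token in NUMBER.findall(text)}


def ungrounded_numbers(answer: str, source_pages: list[str]) -> set[str]:
    """Return numbers that appear in the answer but in none of the source pages."""
    allowed: set[str] = set()
    for page in source_pages:
        allowed |= extract_numbers(page)
    return extract_numbers(answer) - allowed


# Placeholder values for illustration only.
pages = ["Total net sales 1,000 2,000"]
print(ungrounded_numbers("Net sales rose from 1,000 to 2,000.", pages))
print(ungrounded_numbers("Net sales rose from 1,000 to 2,500.", pages))
```

An empty result means every number in the answer can be traced to a source page. The comparison is purely textual, so `5` and `5.0` count as different values and derived figures such as growth rates would be flagged.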
+ ---
+
+ ## Technical Stack
+
+ - **Entity Routing:** GLiNER (CPU)
+ - **Dense Retrieval:** gte-Qwen2-7B
+ - **Reranking:** BAAI/bge-reranker-v2-m3
+ - **Vision Reasoning:** Qwen2-VL-2B
+ - **UI:** Gradio (Hugging Face Spaces, ZeroGPU)
+
+ **Design choice:**
+ Embeddings are recomputed on-the-fly to keep the system **stateless and fully inspectable**.
+
+ ---
+
+ ## Demo
+
+ ### ▶ Live Demo (Recommended)
+
+ Deployed on **Hugging Face Spaces** using ZeroGPU.
+ Link provided on CV.
+
+ ### 💻 Local Execution
+
+ Requires a GPU with ~24 GB VRAM.
+
+ ```bash
+ git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
+ cd YOUR_SPACE_NAME
+ pip install -r requirements.txt
+ python app.py
+ ```
+
+ ## Known Limitations & Trade-offs
+
+ ### 1. Entity detection ≠ entity role understanding
+
+ The router detects *which* entities are mentioned, not *how* they are used.
+
+ Example:
+
+ > “According to Apple documents, what is Microsoft’s revenue?”
+
+ This may succeed because the system prioritizes answering a factual question over enforcing adversarial source constraints.
+
+ **Production fix:** dependency parsing or semantic role labeling.
+
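To make the gap concrete, the sketch below separates the company named as the source from the company the question is about. It uses a regular expression as a deliberately crude stand-in for dependency parsing or semantic role labeling, so it only catches a few fixed phrasings; the pattern, the `COMPANIES` tuple, and the function name are hypothetical.

```python
import re

COMPANIES = ("apple", "microsoft")

# Crude pattern for phrases such as "According to Apple documents"
# or "based on Microsoft's report". Not a substitute for real parsing.
SOURCE_PATTERN = re.compile(
    r"\b(?:according to|based on|using|from)\s+(?:the\s+)?(\w+)(?:['’]s)?\s+"
    r"(?:documents?|reports?|filings?)\b",
    re.IGNORECASE,
)


def check_source_constraint(query: str) -> tuple[bool, str]:
    """Return (ok, reason); ok is False when the named source and the subject differ."""
    match = SOURCE_PATTERN.search(query)
    if match is None:
        return True, "no explicit source constraint"
    source = match.group(1).lower()
    # Look for company names in the part of the query outside the source phrase.
    rest = (query[: match.start()] + " " + query[match.end():]).lower()
    subjects = {name for name in COMPANIES if re.search(rf"\b{name}\b", rest)}
    if subjects and subjects != {source}:
        asked = ", ".join(sorted(subjects))
        return False, f"source is {source}, but the question is about {asked}"
    return True, "source and subject agree"


print(check_source_constraint("According to Apple documents, what is Microsoft's revenue?"))
print(check_source_constraint("According to Apple documents, what is Apple's revenue?"))
```

A router could run a check like this before retrieval and refuse when it returns `False`, which would turn the example query above into an explicit rejection.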
+ ---
+
+ ### 2. Explicit multi-entity reasoning is allowed
+
+ Queries that intentionally involve multiple companies may produce cross-company aggregation.
+
+ This is a **design choice**, not a bug.
+
+ ---
+
+ ### 3. Stateless retrieval
+
+ No vector database is used.
+
+ **Trade-off:**
+ - Higher latency
+ - Maximum transparency
+
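The stateless approach can be illustrated with a retrieval function that embeds the query and every page on each call and keeps no index between calls. This is a sketch under assumptions: `toy_embed` is a word-count stand-in for the dense model so the example runs without a GPU, and all names and sample pages are hypothetical.

```python
import math
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedder based on word counts; a real system would call a dense model."""
    return {word: float(count) for word, count in Counter(text.lower().split()).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query: str,
    pages: list[str],
    embed: Callable[[str], Vector] = toy_embed,
    top_k: int = 2,
) -> list[tuple[float, str]]:
    """Score every page against the query from scratch; nothing is cached."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(page)), page) for page in pages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


pages = ["net sales by product", "operating expenses and research", "shareholder letter"]
for score, page in retrieve("net sales", pages, top_k=2):
    print(f"{score:.3f}  {page}")
```

Every page is re-embedded per query, so cost grows linearly with corpus size. That is workable for a six-page dataset and is the source of the higher latency noted above.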
+ ---
+
+ ## What this project demonstrates
+
+ - Ability to design constrained RAG pipelines
+ - Understanding of LLM failure modes
+ - Practical multimodal reasoning on structured financial data
+ - Clear separation between routing logic, retrieval, prompting, and UI
+ - Engineering honesty about system guarantees vs limitations
+
+ ---
+
+ ## Context
+
+ Built as part of a **Master 1 in Artificial Intelligence**.
+ Learning-focused portfolio project, **not** a production financial advisory system.
+