Sathvik-kota committed on
Commit
24a76a0
·
verified ·
1 Parent(s): 8a5f06b

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +51 -0
README.md CHANGED
@@ -119,6 +119,57 @@ Everything starts with **one command**:
  bash start.sh
 
 
+ ## 🧬 Design Choices
+
+ ### 1️⃣ **Microservices Instead of a Monolith**
+ - Real-world ML systems separate **indexing, embedding, routing, and inference**.
+ - Enables **independent scaling**, easier debugging, and service-level isolation.
+ - Demonstrates **system design skills** in a form that is easy to walk through in interviews.
+
+ ---
+
+ ### 2️⃣ **MiniLM Embeddings**
+ - ⚡ **Fast on CPU** (small enough for lightweight inference)
+ - 🎯 **High semantic quality** for short and long text (inputs past the model's maximum sequence length are typically truncated, so very long documents may need chunking)
+ - 🪶 **Small model** → well suited to search engines, mobile, and Spaces deployments
+
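+ ---
+
+ ### 3️⃣ **FAISS L2 on Normalized Embeddings**
+ L2 distance is used instead of cosine because:
+
+ - 🚀 **FAISS `IndexFlatL2`** is an exact, well-optimized brute-force index
+ - ✨ When vectors are L2-normalized:
+   `‖a − b‖² = 2 − 2·cos(a, b)`, so ranking by L2 distance gives the same order as ranking by cosine similarity (the values differ, the ordering does not)
+ - 🧩 Avoids a separate cosine computation: in FAISS, cosine search is done by normalizing vectors and then using an inner-product or L2 index
+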
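The relationship between L2 distance and cosine similarity on unit vectors can be checked numerically. The snippet below is a minimal sketch in plain Python (no FAISS required); the toy vectors and helper names are made up for illustration and are not part of this project's code.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy data, invented for illustration.
query = normalize([1.0, 2.0, 0.5])
docs = [normalize(d) for d in ([0.9, 2.1, 0.4], [-1.0, 0.3, 2.0], [2.0, 0.1, 0.1])]

# Identity for unit vectors: ||a - b||^2 == 2 - 2 * cos(a, b)
for d in docs:
    assert math.isclose(squared_l2(query, d), 2 - 2 * cosine(query, d), abs_tol=1e-12)

# Ascending L2 distance and descending cosine similarity give the same ranking.
by_l2 = sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))
by_cos = sorted(range(len(docs)), key=lambda i: -cosine(query, docs[i]))
assert by_l2 == by_cos
```

Because only the ordering matters for top-k retrieval, either metric returns the same neighbors once the embeddings are normalized.
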
+ ---
+
+ ### 4️⃣ **Local Embedding Cache**
+ - Reduces startup time from **~5 seconds → <1 second**
+ - Prevents **re-embedding identical documents**
+ - Stores:
+   - `embed_meta.json` → filename → hash → index
+   - `embeddings.npy` → matrix of stored embeddings
+ - Saves compute and makes repeated searches much faster
+
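The cache lookup described above might look roughly like the following. This is a sketch under stated assumptions, not the project's actual code: the JSON layout (`filename -> {"hash", "index"}`), the choice of SHA-256, and the function names `content_hash`, `plan_embedding_work`, and `load_meta` are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def content_hash(text: str) -> str:
    """SHA-256 of the document text; identical text yields an identical hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_embedding_work(docs: dict, meta: dict):
    """Split documents into cache hits and files that need (re-)embedding.

    docs: filename -> document text
    meta: filename -> {"hash": str, "index": int}  (assumed layout)
    Returns (hits, misses): hits maps filename -> row index in the stored
    matrix; misses lists filenames whose text is new or has changed.
    """
    hits, misses = {}, []
    for name, text in docs.items():
        entry = meta.get(name)
        if entry is not None and entry["hash"] == content_hash(text):
            hits[name] = entry["index"]
        else:
            misses.append(name)
    return hits, misses


def load_meta(path: Path) -> dict:
    """Read the metadata file, treating a missing file as an empty cache."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


# Example with in-memory data (no files touched).
meta = {"a.txt": {"hash": content_hash("hello"), "index": 0}}
docs = {"a.txt": "hello", "b.txt": "new document"}
hits, misses = plan_embedding_work(docs, meta)
```

In a setup like this, rows for the hits would be read from `embeddings.npy` (for example with `numpy.load`), and only the misses would be sent to the embedding model.
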
+ ---
+
+ ### 5️⃣ **LLM-Driven Explainability**
+ - Generates **human-friendly reasoning**
+ - Explains **why a document matched your query**
+ - Combines:
+   - Top semantic-matching sentences
+   - Keyword overlap
+   - Gemini’s natural-language reasoning
+
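One way the keyword-overlap signal and the prompt assembly could fit together is sketched below. Everything here is illustrative: the tokenizer, the stop-word list, the prompt wording, and the function names are assumptions, and the actual call to Gemini is left out.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase word tokens; a deliberately simple stand-in for real tokenization."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# Tiny illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "is", "how"}


def keyword_overlap(query: str, document: str) -> list:
    """Sorted keywords that appear in both the query and the document."""
    return sorted((tokenize(query) & tokenize(document)) - STOP_WORDS)


def build_explanation_prompt(query: str, top_sentences: list, keywords: list) -> str:
    """Assemble the evidence into a prompt for the LLM (hypothetical wording)."""
    lines = [
        f"Query: {query}",
        "Most similar sentences from the document:",
        *[f"- {s}" for s in top_sentences],
        f"Shared keywords: {', '.join(keywords) if keywords else 'none'}",
        "In two sentences, explain why this document matches the query.",
    ]
    return "\n".join(lines)


query = "how to normalize embeddings for FAISS"
document = "FAISS works well when embeddings are normalized. Normalize vectors before indexing."
keywords = keyword_overlap(query, document)
prompt = build_explanation_prompt(query, ["Normalize vectors before indexing."], keywords)
```

The resulting `prompt` string would then be passed to the LLM, which turns the raw evidence into a short natural-language explanation.
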
+ ---
+
+ ### 6️⃣ **Streamlit for Fast UI**
+ - ⚡ Instant reload during development
+ - 🎨 Clean layout for Gemini-style cards
+ - 🧱 Easy to extend (evaluation panel, metrics, expanders)
+
 
 
  ## 🏗️ Architecture Overview