Sathvik-kota committed on
Commit
b7f70b9
·
verified ·
1 Parent(s): b8f1779

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +20 -15
README.md CHANGED
@@ -68,8 +68,8 @@ Metrics included:
68
  - **Correct vs Incorrect Fetches**
69
 
70
  ---
71
- #How Caching Works
72
- Caching happens inside **`embed_service/cache_manager.py`**.
73
 
74
  ### ✔ Prevents re-embedding unchanged files
75
  Each document is identified by: filename + MD5(clean_text)
@@ -91,7 +91,7 @@ Caching benefits:
91
 
92
  ---
93
 
94
- # 🧠 How to Run Embedding Generation
95
  ### Embedding happens automatically during **initialization**:
96
 
97
  `POST /initialize` (handled by API Gateway):
@@ -108,7 +108,7 @@ POST /embed_batch
108
  POST /embed_document
109
 
110
  ---
111
- ### 🧩 FAISS Persistence (Warm Start Optimization)
112
 
113
  The system stores embeddings **and** the FAISS vector index on disk:
114
 
@@ -131,19 +131,19 @@ On startup, the `search_service` automatically runs:
131
  ---
132
 
133
  ### 2️⃣ **MiniLM Embeddings**
134
- -**Fast on CPU** (optimized for lightweight inference)
135
- - 🎯 **High semantic quality** for short & long text
136
- - 🪶 **Small model** → ideal for search engines, mobile, Spaces deployments
137
 
138
  ---
139
 
140
  ### 3️⃣ **FAISS L2 on Normalized Embeddings**
141
  L2 distance is used instead of cosine because:
142
 
143
- - 🚀 **FAISS FlatL2 is faster** and more optimized
144
- - When vectors are normalized:
145
  `‖a − b‖² = 2 − 2·cos(a, b)`, so L2 and cosine distance rank results identically
146
- - 🧩 Avoids the overhead of cosine kernels
147
 
148
  ---
149
 
@@ -156,7 +156,12 @@ L2 distance is used instead of cosine because:
156
  - Saves compute + makes repeated searches much faster
157
 
158
  ---
159
-
 
 
 
 
 
160
  ### 5️⃣ **LLM-Driven Explainability**
161
  - Generates **human-friendly reasoning**
162
  - Explains **why a document matched your query**
@@ -168,13 +173,13 @@ L2 distance is used instead of cosine because:
168
  ---
169
 
170
  ### 6️⃣ **Streamlit for Fast UI**
171
- -Instant reload during development
172
- - 🎨 Clean layout
173
- - 🧱 Easy to extend (evaluation panel, metrics, expanders)
174
 
175
 
176
 
177
- ## 🏗️ Architecture Overview
178
 
179
  ### High-level Flow
180
 
 
68
  - **Correct vs Incorrect Fetches**
69
 
70
  ---
71
+ # How Caching Works
72
+ Caching happens inside **`embed_service/cache_manager.py`**. We never embed the same document twice.
73
 
74
  ### ✔ Prevents re-embedding unchanged files
75
  Each document is identified by: filename + MD5(clean_text)
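
The `filename + MD5(clean_text)` key described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `cache_manager.py`: the names `cache_key`, `EmbeddingCache`, and `get_or_compute`, the `:` separator, and the in-memory dict are all assumptions made for the example.

```python
import hashlib


def cache_key(filename: str, clean_text: str) -> str:
    """Build a key from the filename plus the MD5 of the cleaned text."""
    digest = hashlib.md5(clean_text.encode("utf-8")).hexdigest()
    return f"{filename}:{digest}"  # separator is an assumption


class EmbeddingCache:
    """Hypothetical in-memory cache; a real one would likely persist to disk."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, filename, clean_text, embed_fn):
        key = cache_key(filename, clean_text)
        if key not in self._store:
            # Only unseen (filename, content) pairs reach the embedder.
            self._store[key] = embed_fn(clean_text)
        return self._store[key]
```

Because the hash covers the cleaned text, editing a file changes its key and triggers a fresh embedding, while an unchanged file is served from the cache.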
 
91
 
92
  ---
93
 
94
+ # How to Run Embedding Generation
95
  ### Embedding happens automatically during **initialization**:
96
 
97
  `POST /initialize` (handled by API Gateway):
 
108
  POST /embed_document
109
 
110
  ---
111
+ ### FAISS Persistence (Warm Start Optimization)
112
 
113
  The system stores embeddings **and** the FAISS vector index on disk:
114
 
 
131
  ---
132
 
133
  ### 2️⃣ **MiniLM Embeddings**
134
+ - **Fast on CPU** (optimized for lightweight inference)
135
+ - **High semantic quality** for short & long text
136
+ - **Small model** → ideal for search engines, mobile, Spaces deployments
137
 
138
  ---
139
 
140
  ### 3️⃣ **FAISS L2 on Normalized Embeddings**
141
  L2 distance is used instead of cosine because:
142
 
143
+ - **FAISS FlatL2 is faster** and more optimized
144
+ - When vectors are normalized:
145
  `‖a − b‖² = 2 − 2·cos(a, b)`, so L2 and cosine distance rank results identically
146
+ - Avoids the overhead of cosine kernels
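
To see why L2 search on normalized vectors returns the same neighbours as cosine search, note that for unit vectors the squared L2 distance equals `2 − 2·cos(a, b)`. The sketch below checks this with plain Python. It does not call FAISS, and the helper names and random test vectors are illustrative assumptions only.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
query = normalize([rng.gauss(0, 1) for _ in range(8)])
docs = [normalize([rng.gauss(0, 1) for _ in range(8)]) for _ in range(5)]

# Rank documents by ascending L2 distance and by descending cosine similarity.
by_l2 = sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))
by_cos = sorted(range(len(docs)), key=lambda i: -cosine_similarity(query, docs[i]))
```

The two orderings agree because the squared distance is a decreasing function of the cosine similarity. The distance values themselves differ from cosine distance by a factor of two.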
147
 
148
  ---
149
 
 
156
  - Saves compute + makes repeated searches much faster
157
 
158
  ---
159
+ ### 4️⃣ **FAISS Persistence (Warm Start Optimization)**
160
+ - Eliminates the need to rebuild the index on every startup
161
+ - Warm-loads instantly using `try_load()`
162
+ - Ideal for Spaces & Docker environments
163
+ - Lets the on-disk index act as a lightweight vector database
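
The warm-start pattern behind `try_load()` can be sketched as "load the saved snapshot if one exists, otherwise rebuild and save". The example below is an assumption-laden illustration rather than the project's code: it persists plain vectors as JSON so it runs with the standard library alone, whereas the real service presumably reads and writes a FAISS index file. The helpers `save` and `load_or_build` are hypothetical.

```python
import json
import os


def try_load(path):
    """Return the saved vectors, or None if no usable snapshot exists."""
    if not os.path.exists(path):
        return None
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        return None  # unreadable or corrupt snapshot: caller rebuilds


def save(path, vectors):
    """Write to a temp file, then rename, so a crash never leaves a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(vectors, f)
    os.replace(tmp, path)


def load_or_build(path, build_fn):
    """Warm start when possible; fall back to a cold rebuild."""
    vectors = try_load(path)
    if vectors is None:
        vectors = build_fn()
        save(path, vectors)
    return vectors
```

On the first run `build_fn` is called and its result is persisted. Later runs skip the rebuild entirely, which is the saving that matters on Spaces or in a Docker container that restarts often.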
164
+ ---
165
  ### 5️⃣ **LLM-Driven Explainability**
166
  - Generates **human-friendly reasoning**
167
  - Explains **why a document matched your query**
 
173
  ---
174
 
175
  ### 6️⃣ **Streamlit for Fast UI**
176
+ - Instant reload during development
177
+ - Clean layout
178
+ - Easy to extend (evaluation panel, metrics, expanders)
179
 
180
 
181
 
182
+ ## Architecture Overview
183
 
184
  ### High-level Flow
185