Update README.md
README.md
CHANGED
@@ -15,30 +15,29 @@ pinned: false
 [Python 3.10](https://www.python.org/downloads/release/python-3100/)
 [License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)

-> **High-performance Hybrid Search & Reranking Engine based on BGE-M3.** > An advanced knowledge retrieval API system
-

 ---

 ## Key Features
-* **Hybrid Search:** Seamlessly combines Dense & Sparse vector retrieval using Qdrant's Native Fusion API (BGE-M3).
-* **Re-ranking:** Ensures top-tier precision by re-ordering search results via
-* **
-* **
-* **

 ---

 ## Project Structure
-

 ```text
-├── api/        # API Routing &
-├── core/       # Global Configuration (Pydantic
 ├── models/     # AI Model Inference (Embedder, Reranker)
 ├── services/   # Business Logic & Search Pipeline Orchestration
 ├── storage/    # Infrastructure Layer (Qdrant, SQLite Clients)
-├── scripts/    # Data Pipeline &
 ├── templates/  # Demo UI (Jinja2 Templates)
 └── main.py     # App Entry Point & Lifespan Management
 ```
@@ -47,20 +46,21 @@ This project follows the **Separation of Concerns (SoC)** principle to ensure th

 ## Tech Stack
 * **Framework:** FastAPI
-* **Vector DB:** Qdrant (
 * **RDBMS:** SQLite (Metadata & Corpus Storage)
 * **ML Models:**
-  * `BAAI/bge-m3`
-  * `BAAI/bge-reranker-v2-m3` (Cross-Encoder)
-* **DevOps:** Docker, GitHub Actions, Hugging Face Hub

 ---

 ## Installation & Setup

 ### Prerequisites
-* Python 3.10
-* Hugging Face Access Token (

 ### Running Locally
 1. Clone the repository:
@@ -72,35 +72,48 @@ This project follows the **Separation of Concerns (SoC)** principle to ensure th
 ```bash
 pip install -r requirements.txt
 ```
-3. Run the application
 ```bash
 python main.py
-# OR
 uvicorn main:app --host 0.0.0.0 --port 7860
 ```

 ---

 ## API Endpoints
 | Method | Endpoint | Description |
 | :--- | :--- | :--- |
 | `GET` | `/` | Redirects to Search Demo UI |
-| `POST` | `/api/v1/search/` | Executes JSON-based Hybrid Search |
 | `GET` | `/api/v1/system/health/ping` | System health check (Heartbeat) |

 ---

 ## Architecture Insights
-1. **
-2. **
-3. **Deployment Ready:** Optimized for PaaS environments (like HF Spaces) through a containerized Docker setup and automated CI/CD.

 ---

 ## Documentation
-For more detailed technical documentation
 * [Personal Archive Link](https://minjae-portfolio.vercel.app/projects/ke)
 * [Technical Design Blog](https://minjae-portfolio.vercel.app/blogs/ke-pd)


----
@@ -15,30 +15,29 @@ pinned: false
 [Python 3.10](https://www.python.org/downloads/release/python-3100/)
 [License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)

+> **High-performance Hybrid Search & Reranking Engine based on BGE-M3.** > An advanced knowledge retrieval API system designed for Agentic AI, combining Dense/Sparse embeddings and optimizing precision with Cross-Encoders.

 ---

 ## Key Features
+* **Hybrid Search (RRF):** Seamlessly combines Dense & Sparse vector retrieval using Qdrant's Native Fusion API (BGE-M3).
+* **Cross-Encoder Re-ranking:** Ensures top-tier precision by re-ordering search results contextually via `bge-reranker-v2-m3`.
+* **Agent-Ready Output:** Natively provides XML-tagged context blocks optimized for immediate injection into LLMs and Agentic workflows.
+* **Auto-Healing & Sync:** Robust startup logic via FastAPI `lifespan` that automatically pulls pre-processed knowledge bases from Hugging Face Datasets and synchronizes them.
+* **Clean Architecture:** Highly modularized layers (API, Service, Storage, Models) using Dependency Injection for superior maintainability.
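The "(RRF)" label in the hybrid search bullet refers to reciprocal rank fusion, which merges the dense and sparse rankings by summing `1 / (k + rank)` for each document. In this project the fusion is delegated to Qdrant, so the pure-Python sketch below only illustrates the scoring rule. The function name `rrf_fuse`, the sample document IDs, and the constant `k = 60` (a commonly used default) are assumptions, not taken from the repository.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the dense and sparse retrievers.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([dense, sparse])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they tend to outrank documents found by only one retriever, regardless of how the raw dense and sparse scores are scaled.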

 ---

 ## Project Structure
+Follows the **Separation of Concerns (SoC)** principle to ensure the system remains extensible and testable.

 ```text
+├── api/        # API Routing & Schema Definitions
+├── core/       # Global Configuration (Pydantic V2) & Exception Handling
 ├── models/     # AI Model Inference (Embedder, Reranker)
 ├── services/   # Business Logic & Search Pipeline Orchestration
 ├── storage/    # Infrastructure Layer (Qdrant, SQLite Clients)
+├── scripts/    # Data Pipeline & HF Dataset Sync Scripts
 ├── templates/  # Demo UI (Jinja2 Templates)
 └── main.py     # App Entry Point & Lifespan Management
 ```
@@ -47,20 +46,21 @@ This project follows the **Separation of Concerns (SoC)** principle to ensure th

 ## Tech Stack
 * **Framework:** FastAPI
+* **Vector DB:** Qdrant (Server Mode)
 * **RDBMS:** SQLite (Metadata & Corpus Storage)
 * **ML Models:**
+  * [`BAAI/bge-m3`](https://huggingface.co/BAAI/bge-m3) (Dense + Sparse Embedding)
+  * [`BAAI/bge-reranker-v2-m3`](https://huggingface.co/BAAI/bge-reranker-v2-m3) (Cross-Encoder)
+* **DevOps:** Docker, GitHub Actions, Hugging Face Hub (Spaces & Datasets)
+* **Corpus:** [FineWiki](https://huggingface.co/datasets/HuggingFaceFW/finewiki) (Currently consists only of kowiki; enwiki, eswiki, etc. to be added later)

 ---

 ## Installation & Setup

 ### Prerequisites
+* Python 3.10+
+* Hugging Face Access Token (For initial setup/updates)

 ### Running Locally
 1. Clone the repository:
@@ -72,35 +72,48 @@ This project follows the **Separation of Concerns (SoC)** principle to ensure th
 ```bash
 pip install -r requirements.txt
 ```
+3. Run the application:
+*(The system will automatically download the pre-built SQLite and Qdrant DB files from HF Datasets on startup via `scripts/setup_db.py`)*
 ```bash
 python main.py
+# OR
 uvicorn main:app --host 0.0.0.0 --port 7860
 ```
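The startup note in step 3 says the data files are fetched before the app serves requests. A minimal sketch of that pattern follows, written as an async context manager of the kind FastAPI accepts through its `lifespan` parameter. It uses only the standard library so it runs on its own; `ensure_knowledge_base`, `DATA_FILES`, and the file names are hypothetical placeholders, and the real download logic in `scripts/setup_db.py` is not shown in this diff.

```python
import asyncio
import tempfile
from contextlib import asynccontextmanager
from pathlib import Path

DATA_FILES = ("corpus.sqlite", "qdrant_snapshot.tar")  # hypothetical file names


def ensure_knowledge_base(data_dir: Path) -> list[str]:
    """Create any missing data file; a real version would download it instead."""
    fetched = []
    data_dir.mkdir(parents=True, exist_ok=True)
    for name in DATA_FILES:
        target = data_dir / name
        if not target.exists():
            target.touch()  # stand-in for the download step
            fetched.append(name)
    return fetched


@asynccontextmanager
async def lifespan(app):
    # Startup phase: runs once, before the first request is served.
    app["fetched"] = ensure_knowledge_base(app["data_dir"])
    yield
    # Shutdown phase: release resources here (nothing to do in this sketch).


async def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        app = {"data_dir": Path(tmp)}  # a dict stands in for the FastAPI app
        async with lifespan(app):
            pass
        first = app["fetched"]
        async with lifespan(app):  # second start: files are already present
            pass
        return {"first": first, "second": app["fetched"]}


result = asyncio.run(demo())
print(result)
```

With FastAPI installed, the same function shape would be registered as `FastAPI(lifespan=lifespan)`, with state stored on `app.state` rather than in a dict. The second start fetches nothing, which is the idempotent behavior a restart-prone PaaS host needs.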

+### Preprocessing Pipeline (Optional)
+If you want to build the knowledge base from scratch:
+```bash
+# 1. Download qdrant binary (Linux x86_64)
+wget https://github.com/qdrant/qdrant/releases/download/v1.16.2/qdrant-x86_64-unknown-linux-gnu.tar.gz
+tar -xvf qdrant-x86_64-unknown-linux-gnu.tar.gz
+chmod +x qdrant
+
+# 2. Execute Pipeline
+python scripts/data_pipeline.py --lang en --chunk_batch_size 10000 --limit 50000 --batch_size 1024 --workers 4 --upload --repo_id user/id
+```
+
 ---

 ## API Endpoints
 | Method | Endpoint | Description |
 | :--- | :--- | :--- |
 | `GET` | `/` | Redirects to Search Demo UI |
+| `POST` | `/api/v1/search/` | Executes JSON-based Hybrid Search (Returns structured JSON & LLM context) |
 | `GET` | `/api/v1/system/health/ping` | System health check (Heartbeat) |
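The search endpoint is described as returning an LLM context alongside the structured JSON. The actual tag names and response schema are not shown in this diff, so the sketch below is only one plausible shape: the `<context>` and `<document>` tags, the `build_llm_context` name, and the sample hits are all assumptions.

```python
from xml.sax.saxutils import escape, quoteattr


def build_llm_context(hits):
    """Wrap reranked hits in XML-style tags so an LLM can tell passages apart."""
    blocks = []
    for rank, hit in enumerate(hits, start=1):
        # quoteattr adds the surrounding quotes and escapes attribute values;
        # escape protects '&', '<' and '>' inside the passage text.
        blocks.append(
            f"<document rank={quoteattr(str(rank))} title={quoteattr(hit['title'])}>\n"
            f"{escape(hit['text'])}\n"
            "</document>"
        )
    return "<context>\n" + "\n".join(blocks) + "\n</context>"


# Hypothetical reranked hits, best match first.
hits = [
    {"title": "Qdrant", "text": "Qdrant is a vector database."},
    {"title": "R&D", "text": "Scores < 0.5 are dropped."},
]

context = build_llm_context(hits)
print(context)
```

Escaping matters here because corpus text can contain angle brackets or ampersands that would otherwise be read as tags by a downstream parser or confuse the model about where a passage ends.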

 ---

 ## Architecture Insights
+1. **O(1) Metadata Mapping:** By storing massive text payloads in SQLite and only vectors/IDs in Qdrant, we achieve extremely low latency during the reranking preparation phase.
+2. **Zero-Downtime Deployment:** Optimized for PaaS environments (like HF Spaces) through a containerized Docker setup and a custom `start.sh` that ensures DB readiness before FastAPI starts.
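The metadata mapping point describes a split where the vector search returns only IDs and SQLite holds the text. A minimal sketch of the lookup step with the standard `sqlite3` module follows; the `chunks` table, its columns, and `fetch_texts` are assumptions made for illustration, not the repository's schema. A primary-key lookup in SQLite is a B-tree search, so the cost per ID grows logarithmically with table size, which stays small in practice.

```python
import sqlite3


def fetch_texts(conn, ids):
    """Return chunk texts for the given IDs, preserving the input order."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, text FROM chunks WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = dict(rows)  # rows are (id, text) pairs
    # Re-order to match the ranking from the vector search; skip unknown IDs.
    return [by_id[i] for i in ids if i in by_id]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO chunks (id, text) VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],
)

# IDs as they might come back from the vector search, best match first.
texts = fetch_texts(conn, [3, 1])
print(texts)
```

Fetching all candidate IDs in one `IN (...)` query avoids a round trip per hit before the texts are handed to the reranker.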

 ---

 ## Documentation
+For more detailed technical documentation and design decisions:
 * [Personal Archive Link](https://minjae-portfolio.vercel.app/projects/ke)
 * [Technical Design Blog](https://minjae-portfolio.vercel.app/blogs/ke-pd)


+---
+