---
title: Knowledge Engine
emoji: π
colorFrom: purple
colorTo: gray
sdk: docker
app_port: 7860
license: apache-2.0
pinned: false
---
# Knowledge Engine

[Hugging Face Space](https://huggingface.co/spaces/m97j/knowledge-engine)
[Python 3.10](https://www.python.org/downloads/release/python-3100/)
[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)

> **High-performance Hybrid Search & Reranking Engine based on BGE-M3.**
> An advanced knowledge retrieval API system designed for Agentic AI, combining dense/sparse embeddings and sharpening precision with a Cross-Encoder.
---
## Key Features
* **Hybrid Search (RRF):** Seamlessly combines Dense & Sparse vector retrieval using Qdrant's Native Fusion API (BGE-M3).
* **Cross-Encoder Re-ranking:** Ensures top-tier precision by re-ordering search results contextually via `bge-reranker-v2-m3`.
* **Agent-Ready Output:** Natively provides XML-tagged context blocks optimized for immediate injection into LLMs and Agentic workflows.
* **Auto-Healing & Sync:** Robust startup logic via FastAPI `lifespan` that automatically pulls pre-processed knowledge bases from Hugging Face Datasets and synchronizes them.
* **Clean Architecture:** Highly modularized layers (API, Service, Storage, Models) using Dependency Injection for superior maintainability.
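The RRF fusion that Qdrant performs natively can be illustrated with a minimal, self-contained sketch (the `k = 60` constant and the toy ID lists below are illustrative, not taken from this codebase):

```python
def rrf_fuse(dense_ids, sparse_ids, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = {}
    for ranking in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; documents ranked well by BOTH lists win.
    return sorted(scores, key=scores.get, reverse=True)

print(rrf_fuse(["a", "b", "c"], ["b", "c", "d"]))  # ['b', 'c', 'a', 'd']
```

Because RRF works on ranks rather than raw scores, dense cosine similarities and sparse BM25-style weights can be merged without any score normalization.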
---
## Project Structure
Follows the **Separation of Concerns (SoC)** principle to ensure the system remains extensible and testable.
```text
├── api/        # API Routing & Schema Definitions
├── core/       # Global Configuration (Pydantic V2) & Exception Handling
├── models/     # AI Model Inference (Embedder, Reranker)
├── services/   # Business Logic & Search Pipeline Orchestration
├── storage/    # Infrastructure Layer (Qdrant, SQLite Clients)
├── scripts/    # Data Pipeline & HF Dataset Sync Scripts
├── templates/  # Demo UI (Jinja2 Templates)
└── main.py     # App Entry Point & Lifespan Management
```
---
## Tech Stack
* **Framework:** FastAPI
* **Vector DB:** Qdrant (Server Mode)
* **RDBMS:** SQLite (Metadata & Corpus Storage)
* **ML Models:**
* [`BAAI/bge-m3`](https://huggingface.co/BAAI/bge-m3) (Dense + Sparse Embedding)
* [`BAAI/bge-reranker-v2-m3`](https://huggingface.co/BAAI/bge-reranker-v2-m3) (Cross-Encoder)
* **DevOps:** Docker, GitHub Actions, Hugging Face Hub (Spaces & Datasets)
* **Corpus:** [FineWiki](https://huggingface.co/datasets/HuggingFaceFW/finewiki) (currently kowiki only; enwiki, eswiki, etc. to be added later)
---
## Installation & Setup
### Prerequisites
* Python 3.10+
* Hugging Face Access Token (For initial setup/updates)
### Running Locally
1. Clone the repository:
```bash
git clone https://github.com/m97j/knowledge-engine.git
cd knowledge-engine
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the application:
*(The system will automatically download the pre-built SQLite and Qdrant DB files from HF Datasets on startup via `scripts/setup_db.py`)*
```bash
python main.py
# OR
uvicorn main:app --host 0.0.0.0 --port 7860
```
### Preprocessing Pipeline (Optional)
If you want to build the knowledge base from scratch:
```bash
# 1. Download qdrant binary (Linux x86_64)
wget https://github.com/qdrant/qdrant/releases/download/v1.16.2/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xvf qdrant-x86_64-unknown-linux-gnu.tar.gz
chmod +x qdrant
# 2. Execute Pipeline
python scripts/data_pipeline.py --lang en --chunk_batch_size 10000 --limit 50000 --batch_size 1024 --workers 4 --upload --repo_id user/id
```
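The chunking step of the pipeline (controlled by `--chunk_batch_size` above) can be sketched with a hypothetical helper; the real implementation lives in `scripts/data_pipeline.py`, and the `size`/`overlap` values here are illustrative defaults, not the project's actual settings:

```python
def chunk_text(text, size=512, overlap=64):
    """Split a document into fixed-size character chunks with overlap,
    so sentences cut at a boundary still appear intact in one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "x" * 1200
chunks = chunk_text(doc)
print(len(chunks), [len(c) for c in chunks])  # 3 [512, 512, 304]
```

Overlapping chunks trade a little index size for recall: a fact straddling a chunk boundary is still retrievable from the neighboring chunk.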
---
## API Endpoints
| Method | Endpoint | Description |
| :--- | :--- | :--- |
| `GET` | `/` | Redirects to Search Demo UI |
| `POST` | `/api/v1/search/` | Executes JSON-based Hybrid Search (Returns structured JSON & LLM context) |
| `GET` | `/api/v1/system/health/ping` | System health check (Heartbeat) |
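The shape of the LLM-ready context returned by `/api/v1/search/` can be sketched as follows; the tag names, field names, and sample hit are assumptions for illustration, not this API's actual schema:

```python
from xml.sax.saxutils import escape

def to_llm_context(hits):
    """Wrap reranked hits in XML-tagged blocks for direct prompt injection."""
    blocks = [
        f'<doc id="{h["id"]}" score="{h["score"]:.3f}">{escape(h["text"])}</doc>'
        for h in hits
    ]
    return "<context>\n" + "\n".join(blocks) + "\n</context>"

hits = [{"id": 1, "score": 0.92, "text": "BGE-M3 supports dense & sparse output."}]
print(to_llm_context(hits))
```

Explicit XML boundaries let an agent cite or discard individual documents without a separate parsing step, which is why tagged blocks travel better than concatenated plain text.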
---
## Architecture Insights
1. **O(1) Metadata Mapping:** By storing massive text payloads in SQLite and only vectors/IDs in Qdrant, we achieve extremely low latency during the reranking preparation phase.
2. **Zero-Downtime Deployment:** Optimized for PaaS environments (like HF Spaces) through a containerized Docker setup and a custom `start.sh` that ensures DB readiness before FastAPI starts.
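The first insight can be illustrated with an in-memory SQLite sketch: Qdrant returns only hit IDs, and the text payload is then resolved by indexed primary-key lookups (the table and column names here are illustrative, not the project's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE corpus (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany("INSERT INTO corpus VALUES (?, ?)",
                 [(1, "alpha"), (2, "beta"), (3, "gamma")])

def fetch_payloads(ids):
    """Resolve vector-search hit IDs to full text via primary-key lookups."""
    marks = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT id, text FROM corpus WHERE id IN ({marks})", ids).fetchall()
    by_id = dict(rows)
    return [by_id[i] for i in ids]  # preserve the search engine's ranking order

print(fetch_payloads([3, 1]))  # ['gamma', 'alpha']
```

Keeping payloads out of the vector store keeps Qdrant segments small and memory-resident, while the reranker still receives full passages in a single batched `SELECT`.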
---
## Documentation
For more detailed technical documentation and design decisions:
* [Personal Archive Link](https://minjae-portfolio.vercel.app/projects/ke)
* [Technical Design Blog](https://minjae-portfolio.vercel.app/blogs/ke-pd)
---