---
title: Knowledge Engine
emoji: π
colorFrom: purple
colorTo: gray
sdk: docker
app_port: 7860
license: apache-2.0
pinned: false
---
# Knowledge Engine
High-performance Hybrid Search & Reranking Engine based on BGE-M3.

> An advanced knowledge retrieval API system designed for Agentic AI, combining Dense/Sparse embeddings and optimizing precision with Cross-Encoders.
## Key Features
- Hybrid Search (RRF): Seamlessly combines Dense & Sparse vector retrieval (BGE-M3) using Qdrant's native Fusion API (see the sketch after this list).
- Cross-Encoder Re-ranking: Ensures top-tier precision by contextually re-ordering search results via `bge-reranker-v2-m3`.
- Agent-Ready Output: Natively provides XML-tagged context blocks optimized for immediate injection into LLMs and Agentic workflows.
- Auto-Healing & Sync: Robust startup logic via the FastAPI `lifespan` hook that automatically pulls pre-processed knowledge bases from Hugging Face Datasets and synchronizes them.
- Clean Architecture: Highly modularized layers (API, Service, Storage, Models) using Dependency Injection for superior maintainability.
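
The RRF fusion itself runs server-side in Qdrant. Below is a minimal sketch of what such a hybrid query looks like with the `qdrant-client` Query API; the collection name (`wiki`), vector names (`dense`, `sparse`), and the placeholder query vectors are illustrative assumptions, not the project's actual identifiers.

```python
# Minimal sketch of Qdrant's server-side Reciprocal Rank Fusion (qdrant-client >= 1.10).
# Collection/vector names and the query vectors below are placeholders.
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

dense_vec = [0.0] * 1024                      # dense query embedding (e.g. from BGE-M3)
sparse_idx, sparse_val = [42, 7], [0.8, 0.3]  # sparse query terms (token id -> weight)

hits = client.query_points(
    collection_name="wiki",
    prefetch=[
        models.Prefetch(query=dense_vec, using="dense", limit=50),
        models.Prefetch(
            query=models.SparseVector(indices=sparse_idx, values=sparse_val),
            using="sparse",
            limit=50,
        ),
    ],
    query=models.FusionQuery(fusion=models.Fusion.RRF),  # fuse both result lists with RRF
    limit=10,
)
```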
## Project Structure
Follows the Separation of Concerns (SoC) principle to ensure the system remains extensible and testable.
```
├── api/        # API Routing & Schema Definitions
├── core/       # Global Configuration (Pydantic V2) & Exception Handling
├── models/     # AI Model Inference (Embedder, Reranker)
├── services/   # Business Logic & Search Pipeline Orchestration
├── storage/    # Infrastructure Layer (Qdrant, SQLite Clients)
├── scripts/    # Data Pipeline & HF Dataset Sync Scripts
├── templates/  # Demo UI (Jinja2 Templates)
└── main.py     # App Entry Point & Lifespan Management
```
## Tech Stack
- Framework: FastAPI
- Vector DB: Qdrant (Server Mode)
- RDBMS: SQLite (Metadata & Corpus Storage)
- ML Models (a loading sketch follows this list):
  - `BAAI/bge-m3` (Dense + Sparse Embedding)
  - `BAAI/bge-reranker-v2-m3` (Cross-Encoder)
- DevOps: Docker, GitHub Actions, Hugging Face Hub (Spaces & Datasets)
- Corpus: FineWiki (currently consists only of kowiki; enwiki, eswiki, etc. to be added later)
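
As a rough illustration of the ML stack, here is how the two BGE models can be loaded with the FlagEmbedding library; the actual wrappers live in `models/` and may expose a different interface.

```python
# Hedged sketch: loading BGE-M3 and the cross-encoder reranker via FlagEmbedding.
# The project's own wrappers in models/ may differ.
from FlagEmbedding import BGEM3FlagModel, FlagReranker

embedder = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)
reranker = FlagReranker("BAAI/bge-reranker-v2-m3", use_fp16=True)

out = embedder.encode(["what is hybrid search?"], return_dense=True, return_sparse=True)
dense_vec = out["dense_vecs"][0]        # 1024-dim dense embedding
sparse_map = out["lexical_weights"][0]  # {token_id: weight} sparse representation

# Cross-encoder relevance score for a (query, passage) pair
score = reranker.compute_score(["what is hybrid search?",
                                "Hybrid search combines dense and sparse retrieval."])
```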
## Installation & Setup
### Prerequisites
- Python 3.10+
- Hugging Face Access Token (For initial setup/updates)
### Running Locally

- Clone the repository:

  ```bash
  git clone https://github.com/m97j/knowledge-engine.git
  cd knowledge-engine
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the application (the system will automatically download the pre-built SQLite and Qdrant DB files from HF Datasets on startup via `scripts/setup_db.py`; a minimal sketch of this sync step follows these instructions):

  ```bash
  python main.py
  # OR
  uvicorn main:app --host 0.0.0.0 --port 7860
  ```
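
For reference, that auto-sync step can be done with `huggingface_hub`; the sketch below is an assumption about what `scripts/setup_db.py` does, with a placeholder dataset repo ID and target directory.

```python
# Hedged sketch of the startup DB sync (repo_id and local_dir are placeholders).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="user/knowledge-engine-db",  # hypothetical dataset repo with pre-built DB files
    repo_type="dataset",
    local_dir="data/",                   # SQLite file + Qdrant storage are restored here
)
```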
### Preprocessing Pipeline (Optional)

If you want to build the knowledge base from scratch:

```bash
# 1. Download the Qdrant binary (Linux x86_64)
wget https://github.com/qdrant/qdrant/releases/download/v1.16.2/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xvf qdrant-x86_64-unknown-linux-gnu.tar.gz
chmod +x qdrant

# 2. Execute the pipeline
python scripts/data_pipeline.py --lang en --chunk_batch_size 10000 --limit 50000 --batch_size 1024 --workers 4 --upload --repo_id user/id
```
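
Conceptually, the pipeline chunks the corpus, embeds each chunk with BGE-M3, and upserts the dense + sparse vectors into Qdrant while keeping only an ID as payload (the text itself goes to SQLite). A minimal, hedged sketch of that upsert, with assumed collection and vector names:

```python
# Hedged sketch of the ingestion step; names ("wiki", "dense", "sparse") are assumptions.
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="wiki",
    vectors_config={"dense": models.VectorParams(size=1024, distance=models.Distance.COSINE)},
    sparse_vectors_config={"sparse": models.SparseVectorParams()},
)

client.upsert(
    collection_name="wiki",
    points=[
        models.PointStruct(
            id=0,
            vector={
                "dense": [0.0] * 1024,  # dense embedding of the chunk (BGE-M3)
                "sparse": models.SparseVector(indices=[42, 7], values=[0.8, 0.3]),
            },
            payload={"doc_id": 0},  # full text lives in SQLite, keyed by this ID
        )
    ],
)
```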
## API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Redirects to Search Demo UI |
| POST | `/api/v1/search/` | Executes JSON-based Hybrid Search (returns structured JSON & LLM context) |
| GET | `/api/v1/system/health/ping` | System health check (heartbeat) |
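
A quick way to try the search endpoint from Python; the request fields (`query`, `top_k`) are assumed here, so check the schemas in `api/` for the exact contract.

```python
# Example request against a locally running instance (field names are assumptions).
import requests

resp = requests.post(
    "http://localhost:7860/api/v1/search/",
    json={"query": "Who founded the Joseon dynasty?", "top_k": 5},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # structured results plus an XML-tagged context block for LLMs
```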
## Architecture Insights
- O(1) Metadata Mapping: By storing the large text payloads in SQLite and only vectors/IDs in Qdrant, we achieve extremely low latency during the reranking preparation phase (see the sketch below).
- Zero-Downtime Deployment: Optimized for PaaS environments (like HF Spaces) through a containerized Docker setup and a custom `start.sh` that ensures DB readiness before FastAPI starts.
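
The metadata-mapping idea boils down to a primary-key lookup for each point ID returned by Qdrant. A minimal sketch, assuming a hypothetical `corpus(id, text)` table:

```python
# Hedged sketch of the SQLite-side payload lookup (table/column names are assumptions).
import sqlite3

def fetch_texts(db_path: str, doc_ids: list[int]) -> dict[int, str]:
    """Return {doc_id: text} for the IDs coming back from Qdrant."""
    conn = sqlite3.connect(db_path)
    placeholders = ",".join("?" * len(doc_ids))
    rows = conn.execute(
        f"SELECT id, text FROM corpus WHERE id IN ({placeholders})",
        doc_ids,
    ).fetchall()
    conn.close()
    return dict(rows)
```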
## Documentation
For more detailed technical documentation and design decisions: