	# dashVectorspace (xVector)

	Production-Grade Learned Hybrid Retrieval Engine

	This project implements a vector search engine that combines a Learned Router with Custom Sharding on top of Qdrant. Instead of brute-force searching the entire dataset, it routes each query to specific data clusters, which reduces search compute by ~90%.

	## Core Architecture

	1. The Brain (Router): A Machine Learning model (LightGBM, Logistic Regression, or MLP) predicts which cluster contains the answer.
	2. The Body (Vector DB): Qdrant with Custom Sharding. Data is partitioned into 32 Clusters + 1 Freshness Shard.
	3. The Optimization: Matryoshka Representation Learning (MRL). The Router uses sliced 64-dim vectors for speed, while the DB stores full vectors for accuracy.
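
The MRL slicing step can be illustrated with a minimal sketch. It assumes the Router input is simply the first 64 components of the full embedding; re-normalizing the slice is a common practice for cosine similarity, but it is an assumption here, not something this README confirms. The function name `mrl_slice` and the 768-dim example vector are hypothetical.

```python
import math

ROUTER_DIM = 64  # from this README: the Router uses 64-dim slices


def mrl_slice(vector, dim=ROUTER_DIM):
    """Keep the first `dim` components and L2-normalize the result."""
    if len(vector) < dim:
        raise ValueError(f"need at least {dim} dimensions, got {len(vector)}")
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero slice cannot be normalized
    return [x / norm for x in head]


# Hypothetical 768-dim embedding, used only for illustration.
full_vector = [math.sin(i + 1) for i in range(768)]

router_input = mrl_slice(full_vector)  # small vector fed to the Router
db_vector = full_vector                # full vector stored in the vector DB
```

The Router only ever sees `router_input`, so its cost scales with 64 dimensions, while the final nearest-neighbour search still runs on the full vector.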

	## Project Structure

	```
	dashVectorspace/
	├── config.py              # Configuration (Clusters, Models, Paths)
	├── main.py                # Benchmark Runner (P&C Matrix)
	├── requirements.txt       # Dependencies
	├── src/
	│   ├── data_pipeline.py   # Data loading & MRL slicing
	│   ├── router.py          # LearnedRouter (Train/Predict)
	│   ├── vector_db.py       # UnifiedQdrant (Custom Sharding)
	│   └── active_learning.py # Hard Negative Logging
	└── notebooks/
	    └── xVector_Analysis.ipynb # Analysis Notebook
	```

	## Setup & Usage

	1. Install Dependencies:
	```bash
	pip install -r requirements.txt
	```

	2. Run Benchmarks:
	Execute the main script to run the Permutation & Combination (P&C) matrix of experiments:
	```bash
	python main.py
	```
	This will:
	- Generate/Load Data (MS MARCO or Synthetic).
	- Train different Router models (LightGBM, Logistic Regression, MLP).
	- Index data into Qdrant with Custom Sharding.
	- Run test queries and report Accuracy, Latency, and Compute Savings.

	3. Analyze Results:
	Open `notebooks/xVector_Analysis.ipynb` to visualize the active learning logs and performance metrics.
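
As a rough guide to reading the Compute Savings metric, the sketch below estimates the fraction of shards a routed query skips. It assumes every shard holds the same number of points, which real clustered data may not satisfy, and the function name `compute_savings` is hypothetical. It is not the formula used by `main.py`.

```python
NUM_CLUSTERS = 32      # from this README
FRESHNESS_SHARDS = 1   # from this README


def compute_savings(shards_searched, total_shards):
    """Fraction of shards skipped, assuming all shards are equally sized."""
    if not 0 <= shards_searched <= total_shards:
        raise ValueError("shards_searched must be between 0 and total_shards")
    return 1.0 - shards_searched / total_shards


total = NUM_CLUSTERS + FRESHNESS_SHARDS

# Routed query: one predicted cluster plus the Freshness Shard.
routed = compute_savings(1 + FRESHNESS_SHARDS, total)

# Fallback query: every shard is searched, so nothing is saved.
fallback = compute_savings(total, total)

print(f"{routed:.1%}")  # 93.9%
```

Under these assumptions the upper bound for a single routed query is about 94%; any query that falls back to a global search pulls the average down.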

	## Key Features

	- Custom Sharding: Explicit control over where data lives (Clusters 0-31) and a dedicated Freshness Shard (999) for new data.
	- Drift Defense:
	  - Layer 1: Always searches the Freshness Shard.
	  - Layer 2: Falls back to Global Search if Router confidence is low (< 0.5).
	- Active Learning: Logs "Hard Negatives" (low confidence or zero results) to `logs/active_learning_queue.jsonl` for future model retraining.
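
The Drift Defense and Active Learning rules above can be sketched as follows. This is a minimal illustration, not the code in `src/`: the function names, the treatment of "confidence" as the Router's highest cluster probability, and the JSONL record fields are all assumptions. Only the shard IDs, the 0.5 threshold, and the log path come from this README.

```python
import json
from pathlib import Path

NUM_CLUSTERS = 32            # clusters 0-31, from this README
FRESHNESS_SHARD = 999        # from this README
CONFIDENCE_THRESHOLD = 0.5   # from this README
LOG_PATH = Path("logs/active_learning_queue.jsonl")  # from this README


def select_shards(cluster_probs):
    """Return (shards_to_search, confidence) for one query.

    cluster_probs: one probability per cluster, as produced by the Router.
    """
    confidence = max(cluster_probs)
    if confidence < CONFIDENCE_THRESHOLD:
        # Layer 2: low confidence, fall back to searching every cluster.
        shards = list(range(NUM_CLUSTERS))
    else:
        # Confident prediction: search only the most likely cluster.
        shards = [cluster_probs.index(confidence)]
    # Layer 1: the Freshness Shard is always searched.
    shards.append(FRESHNESS_SHARD)
    return shards, confidence


def log_hard_negative(query, confidence, num_results, path=LOG_PATH):
    """Append the query to the JSONL queue if it is a hard negative.

    Returns True if a record was written, False otherwise.
    """
    if confidence >= CONFIDENCE_THRESHOLD and num_results > 0:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {  # hypothetical record layout
        "query": query,
        "confidence": confidence,
        "num_results": num_results,
    }
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return True
```

For example, a Router output with probability 0.9 on cluster 7 yields `[7, 999]`, while a near-uniform output yields all 32 clusters plus shard 999 and is also queued for retraining.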

	## Hugging Face Space Deployment

	This project is ready for deployment on Hugging Face Spaces.

	1. Create a New Space: Select "Gradio" as the SDK.
	2. Upload Files: Upload the contents of the `dashVectorspace` folder to the Space.
	3. Set Secrets: Go to "Settings" -> "Repository secrets" and add:
	   - `QDRANT_URL`: Your Qdrant Cloud Cluster URL.
	   - `QDRANT_API_KEY`: Your Qdrant Cloud API Key.
	4. Ingest Data:
	   - Run `python scripts/ingest_ms_marco.py` locally (with env vars set) to populate your Qdrant Cloud instance and generate `models/router_v1.pkl`.
	   - Upload `models/router_v1.pkl` to the Space (inside a `models/` folder).
	5. Run: The Space will automatically launch `app.py`.
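
Spaces expose secrets to the running app as environment variables. The sketch below shows one way the two secrets from step 3 might be read and validated at startup, so that a missing secret fails with a clear message. The helper name `load_qdrant_settings` is hypothetical and is not taken from `app.py`.

```python
import os

REQUIRED_SECRETS = ("QDRANT_URL", "QDRANT_API_KEY")  # names from this README


def load_qdrant_settings(env=os.environ):
    """Read the Qdrant connection settings, failing early if any are missing."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    return {"url": env["QDRANT_URL"], "api_key": env["QDRANT_API_KEY"]}
```

The returned dictionary uses the `url` and `api_key` keyword names accepted by `qdrant_client.QdrantClient`, so it could be passed along as `QdrantClient(**settings)` if the app uses that client.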