Upload README.md with huggingface_hub
README.md
---
library_name: custom
tags:
- vector-search
- hnsw
- nearest-neighbor
- information-retrieval
- from-scratch
license: mit
---

# HNSW Vector Engine

A zero-dependency implementation of Hierarchical Navigable Small World (HNSW) approximate nearest-neighbor search, built directly from the [Malkov & Yashunin 2018 paper](https://arxiv.org/abs/1603.09320).

Part of [Citadel](https://github.com/dbhavery/citadel), an open-source AI operations platform.

## Why From Scratch?

Most vector search tools wrap existing libraries (FAISS, Annoy, HNSWlib). This implementation builds the full HNSW index from first principles (multi-layer graph construction, greedy search with backtracking, configurable `M`/`ef` parameters) to demonstrate an understanding of the algorithm itself, not just API usage.

## Features

- **Multi-layer graph** with probabilistic level assignment
- **Greedy beam search** with configurable `ef` (exploration factor)
- **Cosine similarity** as the similarity metric
- **Batch insert and query**, with a REST API wrapper
- **Persistent storage** to disk
- **18 tests** covering index construction, search accuracy, and edge cases
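
The probabilistic level assignment mentioned in the feature list can be sketched as follows. In the HNSW paper, each new node draws its top layer as `floor(-ln(U) * mL)` with `U` uniform on (0, 1], and `mL = 1 / ln(M)` is the suggested normalization. The helper name `random_level` and its `rng` argument are illustrative assumptions, not necessarily what this package exposes.

```python
import math
import random
from typing import Optional


def random_level(m: int = 16, rng: Optional[random.Random] = None) -> int:
    """Draw the top layer for a new node (illustrative helper, not the package API)."""
    rng = rng or random.Random()
    m_l = 1.0 / math.log(m)         # level normalization factor from the paper
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
    return int(-math.log(u) * m_l)  # int() floors because the value is non-negative


# Sample many levels to see how sparse the upper layers become.
rng = random.Random(0)
levels = [random_level(16, rng) for _ in range(10_000)]
share_layer0 = levels.count(0) / len(levels)
```

With `M = 16`, about `1 - 1/16` (roughly 94%) of nodes stay on layer 0 only, and each higher layer keeps about one sixteenth of the nodes in the layer below it.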

## Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `M` | 16 | Max connections per node per layer |
| `ef_construction` | 200 | Beam width during index building |
| `ef_search` | 50 | Beam width during query |
| `max_elements` | 10000 | Pre-allocated index size |
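
To make the role of the beam width concrete, here is a minimal sketch of a single-layer best-first search in the style of the paper's SEARCH-LAYER routine. It is not taken from this package: `search_layer`, `cosine_distance`, and the `vectors`/`neighbors` dictionaries are hypothetical stand-ins for the index internals.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means closer. Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def search_layer(query, entry, ef, vectors, neighbors):
    """Best-first search on one layer, keeping at most `ef` results.

    `vectors` maps node id -> vector and `neighbors` maps node id -> list of
    adjacent node ids (both are assumptions made for this sketch).
    """
    d0 = cosine_distance(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry)]     # max-heap via negation: worst result on top
    while candidates:
        dist, node = heapq.heappop(candidates)
        if dist > -results[0][0]:  # closest candidate is worse than worst result
            break
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = cosine_distance(query, vectors[nb])
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst result
    return sorted((-neg, node) for neg, node in results)


# Tiny chain graph 0 - 1 - 2 - 3; the query is identical to node 3.
vectors = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.5, 0.5], 3: [0.0, 1.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
hits = search_layer([0.0, 1.0], entry=0, ef=2, vectors=vectors, neighbors=neighbors)
nearest_ids = [node for _, node in hits]
```

A larger `ef` keeps more candidates alive, which generally raises recall at the cost of more distance computations. That trade-off is the usual reason for building with a wide beam (`ef_construction`) and querying with a narrower one (`ef_search`).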

## Usage

```python
from citadel_vector import VectorStore

# `embeddings`, `metadata`, and `query_vector` are supplied by the caller;
# vectors must match the store's dimensionality (384 here).
store = VectorStore(collection="docs", dim=384)
store.add(vectors=embeddings, metadata=metadata)
results = store.search(query_vector, k=10)  # 10 nearest neighbors
```
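
Because HNSW is approximate, it can be useful to compare its results against an exact scan. The sketch below is a generic brute-force baseline with a recall measure. It works on plain ids and does not assume anything about the return format of `store.search`, which is not documented here. All function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def brute_force_top_k(query, vectors, k):
    """Exact top-k ids by cosine similarity, scanning every vector."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbors that the approximate search also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
exact = brute_force_top_k([1.0, 0.1], vectors, k=2)
# Pretend an approximate search returned ids 0 and 2.
recall = recall_at_k([0, 2], exact)
```

Here the exact neighbors are ids 0 and 1, so an approximate answer of `[0, 2]` scores a recall of 0.5.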

## Architecture

```
Query Vector
    |
    v
[Top Layer]  -- sparse graph, long-range connections
    |
    v
[Layer N-1]  -- denser graph
    |
    v
[Layer 0]    -- full graph, all nodes, local connections
    |
    v
Top-K Results (sorted by cosine similarity)
```
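
The descent through the upper layers in the diagram can be sketched as a greedy walk, which corresponds to a beam width of 1 in the paper's search algorithm. This is a simplified illustration under stated assumptions: `descend_to_layer0` and the `layers` layout (one adjacency dictionary per layer) are hypothetical, and the wider `ef_search` beam on layer 0 is left out.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def descend_to_layer0(query, entry, vectors, layers):
    """Greedy walk from the top layer down to layer 1.

    `layers[l]` maps node id -> neighbor ids on layer l (assumed layout).
    Returns the node that would seed the wider search on layer 0.
    """
    current = entry
    best = cosine_distance(query, vectors[current])
    for layer in range(len(layers) - 1, 0, -1):  # top layer down to layer 1
        improved = True
        while improved:                           # keep moving while a neighbor is closer
            improved = False
            for nb in layers[layer].get(current, []):
                d = cosine_distance(query, vectors[nb])
                if d < best:
                    current, best, improved = nb, d, True
    return current


vectors = {0: [1.0, 0.0], 1: [0.7, 0.7], 2: [0.0, 1.0], 3: [0.1, 1.0]}
layers = [
    {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]},  # layer 0: all nodes
    {0: [1], 1: [0, 2], 2: [1]},                   # layer 1: fewer nodes
    {0: [2], 2: [0]},                              # layer 2: long-range link only
]
seed = descend_to_layer0([0.1, 1.0], entry=0, vectors=vectors, layers=layers)
```

In this toy graph the long-range link on the top layer jumps straight from node 0 to node 2, so the layer 0 search starts next to the true nearest neighbor (node 3) instead of walking the whole graph.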

## Part of Citadel

This vector engine is one of six independently installable packages in the Citadel AI Operations Platform:

- **citadel-gateway**: LLM proxy with routing, caching, circuit breakers
- **citadel-vector**: This package (HNSW vector search)
- **citadel-agents**: ReAct agent runtime with tool registry
- **citadel-ingest**: Document parsing and chunking pipeline
- **citadel-trace**: LLM observability and cost tracking
- **citadel-dashboard**: Real-time operations UI

[GitHub Repository](https://github.com/dbhavery/citadel) | [Author](https://github.com/dbhavery)