dbhavery committed on
Commit
8fcc47f
·
verified ·
1 Parent(s): 44f875b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +79 -0
README.md ADDED
@@ -0,0 +1,79 @@
+ ---
+ library_name: custom
+ tags:
+ - vector-search
+ - hnsw
+ - nearest-neighbor
+ - information-retrieval
+ - from-scratch
+ license: mit
+ ---
+
+ # HNSW Vector Engine
+
+ A zero-dependency implementation of Hierarchical Navigable Small World (HNSW) approximate nearest-neighbor search, built directly from the [Malkov & Yashunin 2018 paper](https://arxiv.org/abs/1603.09320).
+
+ Part of [Citadel](https://github.com/dbhavery/citadel), an open-source AI operations platform.
+
+ ## Why From Scratch?
+
+ Most vector search tools wrap existing libraries (FAISS, Annoy, HNSWlib). This implementation builds the full HNSW index from first principles (multi-layer graph construction, greedy search with backtracking, configurable M/ef parameters) to demonstrate deep understanding of the algorithm, not just API usage.
+
+ ## Features
+
+ - **Multi-layer graph** with probabilistic level assignment
+ - **Greedy beam search** with configurable ef (exploration factor)
+ - **Cosine similarity** distance metric
+ - **Batch insert and query** with REST API wrapper
+ - **Persistent storage** to disk
+ - **18 tests** covering index construction, search accuracy, edge cases
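
The beam search and cosine metric named in the feature list can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the package's code: the names `cosine_distance` and `search_layer`, the dict-based `vectors` and `neighbors` structures, and the handling of zero vectors are all assumptions made for the example.

```python
import heapq
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity, so smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: a zero vector is treated as dissimilar to everything
    return 1.0 - dot / (norm_a * norm_b)


def search_layer(query, entry, vectors, neighbors, ef):
    """Beam search over one graph layer.

    Returns up to `ef` (distance, node_id) pairs, nearest first.
    `vectors` maps node_id -> vector, `neighbors` maps node_id -> list of ids.
    """
    d0 = cosine_distance(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap: closest unexpanded node on top
    best = [(-d0, entry)]       # max-heap via negation: worst kept result on top
    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the closest candidate is worse than the worst kept result.
        if len(best) >= ef and dist > -best[0][0]:
            break
        for nb in neighbors.get(node, ()):
            if nb in visited:
                continue
            visited.add(nb)
            d = cosine_distance(query, vectors[nb])
            if len(best) < ef or d < -best[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(best, (-d, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    return sorted((-neg_d, node) for neg_d, node in best)
```

A larger `ef` keeps more candidates alive, which generally improves recall at the cost of more distance computations per query.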
+
+ ## Parameters
+
+ | Parameter | Default | Description |
+ |-----------|---------|-------------|
+ | `M` | 16 | Max connections per node per layer |
+ | `ef_construction` | 200 | Beam width during index building |
+ | `ef_search` | 50 | Beam width during query |
+ | `max_elements` | 10000 | Pre-allocated index size |
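
To illustrate how `M` shapes the layer structure, here is a minimal sketch of the level-assignment rule from the HNSW paper, which draws a level as floor(-ln(u) * mL) and suggests mL = 1 / ln(M). Whether this package uses exactly that multiplier is an assumption, and the helper names `level_multiplier` and `random_level` are hypothetical.

```python
import math
import random


def level_multiplier(M):
    """Normalization factor mL = 1 / ln(M), the choice suggested in the HNSW paper."""
    return 1.0 / math.log(M)


def random_level(M, rng=random):
    """Draw the top layer for a new node: floor(-ln(u) * mL) with u uniform in (0, 1]."""
    u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1] and log(u) is defined
    return int(-math.log(u) * level_multiplier(M))
```

Under this rule the probability that a node reaches layer l or higher is M^(-l). With the default `M` of 16, roughly 1 - 1/16 (about 94%) of nodes exist only on layer 0, and each layer above holds about one sixteenth as many nodes as the one below it.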
+
+ ## Usage
+
+ ```python
+ from citadel_vector import VectorStore
+
+ # `embeddings`, `metadata`, and `query_vector` are supplied by the caller.
+ store = VectorStore(collection="docs", dim=384)
+ store.add(vectors=embeddings, metadata=metadata)  # batch insert
+ results = store.search(query_vector, k=10)  # 10 nearest neighbors
+ ```
+
+ ## Architecture
+
+ ```
+ Query Vector
+ |
+ v
+ [Top Layer] -- sparse graph, long-range connections
+ |
+ v
+ [Layer N-1] -- denser graph
+ |
+ v
+ [Layer 0] -- full graph, all nodes, local connections
+ |
+ v
+ Top-K Results (sorted by cosine similarity)
+ ```
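
The top-down flow in the diagram above can be sketched as a greedy walk on each upper layer, with the node found on one layer becoming the entry point for the layer below. This is a simplified illustration under stated assumptions: `greedy_step` and `descend` are hypothetical names, `layers` is assumed to be a list of adjacency dicts with index 0 as the bottom layer, and the distance function is passed in so the sketch does not depend on a particular metric.

```python
def greedy_step(query, entry, vectors, neighbors, dist):
    """Walk one layer greedily, moving to a closer neighbor until none improves."""
    current = entry
    current_d = dist(query, vectors[entry])
    improved = True
    while improved:
        improved = False
        # Scan the neighbors of the node we stood on when this pass began.
        for nb in neighbors.get(current, ()):
            d = dist(query, vectors[nb])
            if d < current_d:
                current, current_d = nb, d
                improved = True
    return current


def descend(query, entry, vectors, layers, dist):
    """Descend from the top layer to layer 1 and return the entry point for layer 0.

    `layers[i]` maps node_id -> neighbor ids on layer i (assumed layout).
    """
    current = entry
    for layer in range(len(layers) - 1, 0, -1):
        current = greedy_step(query, current, vectors, layers[layer], dist)
    return current
```

The node returned by `descend` would then seed a wider search on layer 0, with a beam of `ef_search` candidates, from which the top-K results are taken.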
+
+ ## Part of Citadel
+
+ This vector engine is one of 6 independently installable packages in the Citadel AI Operations Platform:
+
+ - **citadel-gateway**: LLM proxy with routing, caching, circuit breakers
+ - **citadel-vector**: This package (HNSW vector search)
+ - **citadel-agents**: ReAct agent runtime with tool registry
+ - **citadel-ingest**: Document parsing and chunking pipeline
+ - **citadel-trace**: LLM observability and cost tracking
+ - **citadel-dashboard**: Real-time operations UI
+
+ [GitHub Repository](https://github.com/dbhavery/citadel) | [Author](https://github.com/dbhavery)