---
license: apache-2.0
language:
- en
tags:
- rag
- retrieval
- semantic-search
- faiss
- bm25
- cross-encoder
- sentence-transformers
- hybrid-search
- dense-retrieval
- ai
- search
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# ApexRetriever

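ApexRetriever is a lightweight hybrid retrieval system for fast semantic search and RAG pipelines. It combines BM25 sparse retrieval, dense retrieval over a FAISS index, and cross-encoder reranking.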

Built for:
- semantic search
- lightweight RAG
- AI assistants
- retrieval systems
- local document QA

---

# Architecture

## Stage ①: BM25 Sparse Retrieval
Keyword-based retrieval for fast lexical matching.
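
To illustrate what lexical scoring does at this stage, here is a minimal pure-Python sketch of Okapi BM25. It is not the code in `pipeline.py`, and it does not use the `rank-bm25` package from the installation list. The function name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Hypothetical helper for illustration; assumes `docs` is non-empty.
    """
    # Naive tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
]
scores = bm25_scores("who created python", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query, so it receives a positive score while the second scores zero. A real tokenizer would also strip punctuation, which this sketch skips.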

## Stage ②: Dense Semantic Search
Powered by:

- `BAAI/bge-small-en-v1.5`

Uses FAISS vector indexing.
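
The idea behind dense search can be sketched without any dependencies: embed the query and the documents, normalize the vectors, and rank documents by inner product, which then equals cosine similarity. An exact FAISS index such as `IndexFlatIP` performs the same brute-force comparison far more efficiently. The source does not say which FAISS index type the pipeline uses, and the toy vectors and the names `normalize` and `top_k_inner_product` below are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k_inner_product(query_vec, doc_vecs, k):
    """Return the k best (doc_index, score) pairs by cosine similarity."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(doc_vecs):
        d = normalize(vec)
        # Inner product of unit vectors is their cosine similarity.
        scored.append((sum(a * b for a, b in zip(q, d)), idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy 3-dimensional "embeddings" (made-up values, not real model output).
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.7, 0.6, 0.1],
]
query_vec = [1.0, 0.2, 0.0]

hits = top_k_inner_product(query_vec, doc_vecs, k=2)
print(hits)
```

In a real setup the vectors would come from the `BAAI/bge-small-en-v1.5` encoder rather than being written by hand.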

## Stage ③: CrossEncoder Reranking
Final neural reranking using:

- `cross-encoder/ms-marco-MiniLM-L-6-v2`
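
Reranking scores each (query, candidate) pair jointly and reorders the candidates by that score. The sketch below shows only the control flow, under the assumption that the scorer is any callable taking a list of pairs and returning one score per pair. The `rerank` helper and the word-overlap `overlap_scorer` are hypothetical stand-ins so the example runs without downloading a model.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Reorder candidates by a pairwise relevance score, highest first."""
    pairs = [(query, doc) for doc in candidates]
    scores = score_fn(pairs)  # one score per (query, document) pair
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def overlap_scorer(pairs):
    """Hypothetical stand-in for a cross-encoder: counts shared lowercase words."""
    scores = []
    for query, doc in pairs:
        q_words = set(query.lower().split())
        d_words = set(doc.lower().split())
        scores.append(float(len(q_words & d_words)))
    return scores


candidates = [
    "Paris is the capital of France.",
    "Python was created by Guido van Rossum.",
]
reranked = rerank("who created python", candidates, overlap_scorer, top_k=1)
print(reranked[0][0])
```

With `sentence-transformers` installed, the `predict` method of a `CrossEncoder` loaded from `cross-encoder/ms-marco-MiniLM-L-6-v2` could be passed as `score_fn` instead, since it also takes a list of text pairs and returns one score per pair.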

---

# Features

- Hybrid retrieval
- Fast indexing
- Dense semantic search
- Neural reranking
- Lightweight deployment
- GPU acceleration
- FAISS support
- Easy integration
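
The source does not describe how the BM25 and dense result lists are merged before reranking. One common option is reciprocal rank fusion (RRF), sketched here as an assumption rather than as the method used in `pipeline.py`. The function name, the document IDs, and the constant `k=60` are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical per-stage rankings (best match first).
bm25_ranking = ["d1", "d3", "d2"]
dense_ranking = ["d3", "d4", "d1"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Documents ranked well by both retrievers (`d3`, `d1`) rise to the top, while documents found by only one retriever still remain as candidates.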

---

# Repository Structure

```text
ApexRetriever/
│
├── bi_encoder/
├── reranker/
├── pipeline.py
└── README.md
```

---

# Installation

```bash
pip install -U \
    sentence-transformers \
    transformers \
    faiss-cpu \
    rank-bm25 \
    torch
```

---

# Quick Start

```python
from pipeline import ApexRetriever

retriever = ApexRetriever(model_dir=".")

# Example documents

docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France."
]

retriever.index_documents(docs)

results = retriever.retrieve(
    "Who created Python?",
    top_k=3
)

print(results)
```

---

# Use Cases

* RAG systems
* Semantic search
* AI chatbots
* Knowledge retrieval
* Local search engines
* Memory systems

---

# Performance

Recommended environment:

* CUDA GPU
* 8 GB+ RAM
* Python 3.10+

---

# Components

| Component     | Model                                |
| ------------- | ------------------------------------ |
| Dense Encoder | BAAI/bge-small-en-v1.5               |
| Reranker      | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| Vector Engine | FAISS                                |
| Sparse Search | BM25                                 |

---

# License

Apache 2.0

---
> QuantaSparkLabs