---
license: apache-2.0
language:
- en
tags:
- rag
- retrieval
- semantic-search
- faiss
- bm25
- reranker
- cross-encoder
- sentence-transformers
- flan-t5
- hybrid-search
- dense-retrieval
- ai
- llm
- search
- question-answering
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# ApexRetriever-Pro

A five-stage hybrid retrieval pipeline that combines BM25 sparse retrieval, dense semantic search, MMR diversity filtering, cross-encoder reranking, and generative refinement with FLAN-T5.

Built for:
- RAG pipelines
- AI agents
- semantic search
- document QA
- memory systems
- knowledge retrieval
- research assistants

---

# Architecture

ApexRetriever-Pro uses a multi-stage retrieval pipeline:

## Stage ① — BM25 Sparse Retrieval
Fast keyword-based retrieval using BM25.
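
As a rough illustration of what this stage computes, the sketch below scores documents with Okapi BM25 in plain Python. It is not the code in `pipeline.py`: the `tokenize` and `bm25_scores` helpers, the regex tokenizer, and the `k1`/`b` defaults are assumptions made for the example, and the `rank-bm25` package listed under Installation may differ in details such as its IDF formula.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase word characters only.
    return re.findall(r"\w+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document (non-negative IDF variant)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs.",
]
scores = bm25_scores("Who created Python?", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query, so it is the only one with a non-zero score.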

## Stage ② — Dense Semantic Retrieval
Semantic vector search powered by:

- `BAAI/bge-small-en-v1.5`

Uses FAISS for high-speed similarity search.
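
The idea behind this stage can be sketched without any dependencies: normalize the vectors, then rank documents by inner product with the query. The toy 3-dimensional vectors below stand in for real `BAAI/bge-small-en-v1.5` embeddings, and `top_k_inner_product` is a hypothetical helper. Which FAISS index type the pipeline actually uses is not stated here; a flat inner-product index such as `faiss.IndexFlatIP` over normalized vectors would produce the same ranking as this exhaustive search.

```python
import math

def normalize(vec):
    # Scale to unit length so inner product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def top_k_inner_product(query_vec, doc_vecs, k=2):
    """Exhaustive search: return (score, index) pairs, best first."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(doc_vecs):
        d = normalize(vec)
        scored.append((sum(a * b for a, b in zip(q, d)), idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors for illustration; real embeddings have hundreds of dimensions.
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
hits = top_k_inner_product([1.0, 0.2, 0.0], doc_vecs, k=2)
print([idx for _, idx in hits])
```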

## Stage ③ — MMR Diversity Filtering
Maximal Marginal Relevance (MMR) balances each candidate's relevance to the query against its similarity to results already selected, which improves diversity and reduces near-duplicate results.
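
A minimal sketch of greedy MMR selection follows, assuming unit-normalized vectors so that the dot product is cosine similarity. The `mmr` function, the `lambda_` weight, and the toy 2-dimensional vectors are illustrative assumptions, not the values used in `pipeline.py`.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mmr(query_vec, doc_vecs, k=2, lambda_=0.5):
    """Greedy MMR: pick k indices, trading relevance against redundancy."""
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = dot(query_vec, doc_vecs[i])
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (dot(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.0]
# Documents 0 and 1 are near-duplicates; document 2 is less relevant but distinct.
doc_vecs = [[0.98, 0.199], [0.96, 0.28], [0.6, -0.8]]
print(mmr(query, doc_vecs, k=2, lambda_=0.5))  # diversity-aware selection
print(mmr(query, doc_vecs, k=2, lambda_=1.0))  # pure relevance ranking
```

With `lambda_=0.5` the second pick skips the near-duplicate in favor of the distinct document; with `lambda_=1.0` MMR reduces to plain relevance ordering.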

## Stage ④ — CrossEncoder Reranking
High-quality neural reranking using:

- `cross-encoder/ms-marco-MiniLM-L-6-v2`

The cross-encoder scores each query-document pair jointly, which typically improves precision over the first-stage rankings.
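
The reranking step can be sketched as "score every (query, document) pair, then sort". In the example below, `rerank`, `load_cross_encoder_scorer`, and `overlap_scorer` are hypothetical names; the real scorer would be `CrossEncoder.predict` from `sentence-transformers`, which is imported lazily so the sketch runs without downloading a model. The word-overlap scorer is only a stand-in for demonstration.

```python
def rerank(query, docs, score_pairs, top_k=3):
    """Order docs by the scores that score_pairs assigns to (query, doc) pairs."""
    pairs = [(query, doc) for doc in docs]
    scores = score_pairs(pairs)
    ranked = sorted(zip(docs, scores), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def load_cross_encoder_scorer(model_name="cross-encoder/ms-marco-MiniLM-L-6-v2"):
    """Return a scorer backed by a cross-encoder (loads the model when called)."""
    from sentence_transformers import CrossEncoder  # lazy import: heavy dependency
    model = CrossEncoder(model_name)
    return model.predict  # takes a list of (query, passage) pairs

def overlap_scorer(pairs):
    # Stand-in scorer for illustration only: counts shared lowercase words.
    return [len(set(q.lower().split()) & set(d.lower().split())) for q, d in pairs]

docs = [
    "Paris is the capital of France.",
    "Transformers power modern LLMs.",
    "Python was created by Guido van Rossum.",
]
top = rerank("who created python", docs, overlap_scorer, top_k=2)
print(top)
```

To use the actual model, pass `load_cross_encoder_scorer()` in place of `overlap_scorer`.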

## Stage ⑤ — FLAN-T5 Refinement
Final answer refinement using:

- `google/flan-t5-base`

Generates concise refined outputs from retrieved context.
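
One plausible shape for this stage is to pack the retrieved passages into a prompt and pass it to the seq2seq model. The sketch below is an assumption about how that could look, not the prompt or generation settings used in `pipeline.py`: `build_prompt`, `refine`, the prompt wording, and `max_new_tokens=64` are all illustrative. The model is loaded lazily so the prompt-building part runs on its own.

```python
def build_prompt(question, passages):
    """Join retrieved passages into a single instruction-style prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def refine(question, passages, model_name="google/flan-t5-base", max_new_tokens=64):
    """Generate a short answer with a seq2seq model (loads the model when called)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer  # lazy import
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(
        build_prompt(question, passages), return_tensors="pt", truncation=True
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

prompt = build_prompt(
    "Who created Python?", ["Python was created by Guido van Rossum."]
)
print(prompt)
```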

---

# Features

- Hybrid sparse+dense retrieval
- FAISS-accelerated search
- MMR diversity optimization
- Neural reranking
- Generative refinement
- GPU acceleration
- Plug-and-play pipeline
- Lightweight deployment
- Kaggle compatible
- Hugging Face compatible

---

# Repository Structure

```text
ApexRetriever-Pro/
│
├── bi_encoder/
├── reranker/
├── flan_t5/
├── pipeline.py
└── README.md
```

---

# Installation

```bash
pip install -U \
    sentence-transformers \
    transformers \
    faiss-cpu \
    rank-bm25 \
    torch
```

---

# Quick Start

```python
from pipeline import ApexRetrieverPro

retriever = ApexRetrieverPro(model_dir=".")

# Example documents

docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs."
]

# Build index

retriever.index_documents(docs)

# Retrieve

results = retriever.retrieve(
    "Who created Python?",
    top_k=3
)

print(results)
```

---

# Example Output

```python
[
    'Python was created by Guido van Rossum.'
]
```

---

# Use Cases

* Retrieval-Augmented Generation (RAG)
* AI chatbots
* Local document search
* Agent memory systems
* Knowledge bases
* Research copilots
* Semantic indexing
* QA systems
* Enterprise search

---

# Performance Notes

Recommended:

* CUDA GPU
* 16GB+ RAM
* Python 3.10+

Works on:

* Kaggle
* Colab
* Local GPU systems
* Linux
* Windows

---

# Model Components

| Component     | Model                                |
| ------------- | ------------------------------------ |
| Dense Encoder | BAAI/bge-small-en-v1.5               |
| Reranker      | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| Refiner       | google/flan-t5-base                  |
| Vector Engine | FAISS                                |
| Sparse Search | BM25                                 |

---

# License

Apache 2.0

---
>QuantaSparkLabs