| license: apache-2.0 | |
| language: | |
| - en | |
| tags: | |
| - rag | |
| - retrieval | |
| - semantic-search | |
| - faiss | |
| - bm25 | |
| - reranker | |
| - cross-encoder | |
| - sentence-transformers | |
| - flan-t5 | |
| - hybrid-search | |
| - dense-retrieval | |
| - ai | |
| - llm | |
| - search | |
| - question-answering | |
| pipeline_tag: sentence-similarity | |
| library_name: sentence-transformers | |
| # ApexRetriever-Pro | |
| A 5-stage hybrid retrieval pipeline that combines sparse (BM25) retrieval, dense semantic search, MMR diversity filtering, cross-encoder reranking, and generative refinement. | |
| Built for: | |
| - RAG pipelines | |
| - AI agents | |
| - semantic search | |
| - document QA | |
| - memory systems | |
| - knowledge retrieval | |
| - research assistants | |
| --- | |
| # Architecture | |
| ApexRetriever-Pro uses a multi-stage retrieval pipeline: | |
| ## Stage 1: BM25 Sparse Retrieval | |
| Fast keyword-based retrieval using BM25. | |
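To make the scoring concrete, here is a minimal from-scratch sketch of Okapi BM25 in plain Python. It is not the code in `pipeline.py` (which presumably relies on the `rank-bm25` package listed under Installation); the function name `bm25_scores`, the regex tokenizer, the non-negative IDF variant, and the defaults `k1=1.5`, `b=0.75` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase word tokenizer, punctuation dropped.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant (log of 1 + ratio).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs.",
]
scores = bm25_scores("Who created Python?", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query, so it is the only one with a non-zero score.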
| ## Stage 2: Dense Semantic Retrieval | |
| Semantic vector search powered by: | |
| - `BAAI/bge-small-en-v1.5` | |
| Uses FAISS for high-speed similarity search. | |
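The sketch below shows, in plain Python, what an exact inner-product search over embeddings computes. It is a brute-force stand-in rather than FAISS itself, and the 3-dimensional vectors are toy values standing in for real `BAAI/bge-small-en-v1.5` embeddings; the helper names are hypothetical. With unit-length vectors, the inner product equals cosine similarity.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k_inner_product(query_vec, doc_vecs, k):
    """Return (index, score) pairs for the k highest inner products."""
    q = normalize(query_vec)
    scored = [
        (i, sum(a * b for a, b in zip(q, normalize(d))))
        for i, d in enumerate(doc_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional vectors standing in for real embeddings.
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
query_vec = [1.0, 0.2, 0.0]

hits = top_k_inner_product(query_vec, doc_vecs, k=2)
print(hits)
```

A vector index returns the same kind of (index, score) result without scanning every document in Python, which is what makes it practical for large collections.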
| ## Stage 3: MMR Diversity Filtering | |
| Maximal Marginal Relevance (MMR) improves result diversity and reduces duplicate-style retrieval. | |
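As an illustration of the idea, the following is a minimal sketch of greedy MMR selection over precomputed similarities. The function name `mmr`, the trade-off parameter `lam`, and the toy similarity values are assumptions; the card does not state which weighting the pipeline uses.

```python
def mmr(query_sim, doc_sim, k, lam=0.7):
    """Select k candidate indices by Maximal Marginal Relevance.

    query_sim[i]  : similarity of candidate i to the query
    doc_sim[i][j] : similarity between candidates i and j
    lam           : 1.0 = pure relevance, 0.0 = pure diversity
    """
    selected = []
    remaining = list(range(len(query_sim)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            # Penalize candidates that resemble something already selected.
            redundancy = max((doc_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Candidates 0 and 1 are near-duplicates; candidate 2 is different.
query_sim = [0.90, 0.88, 0.60]
doc_sim = [
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.15],
    [0.10, 0.15, 1.00],
]
print(mmr(query_sim, doc_sim, k=2, lam=0.5))  # [0, 2]
```

With `lam=0.5` the near-duplicate candidate 1 is skipped in favor of candidate 2; with `lam=1.0` the selection falls back to plain relevance order.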
| ## Stage 4: CrossEncoder Reranking | |
| High-quality neural reranking using: | |
| - `cross-encoder/ms-marco-MiniLM-L-6-v2` | |
| Scores each (query, passage) pair jointly, which typically gives more precise top results than the first-stage retrievers alone. | |
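The reranking step can be sketched as a small function that scores (query, passage) pairs and keeps the best ones. In this sketch the scorer is pluggable: `overlap_scorer` is a toy word-overlap stand-in so the example runs without downloading a model, and `rerank` is a hypothetical helper, not the pipeline's own API. With sentence-transformers installed, the `predict` method of a loaded `CrossEncoder` takes the same list of pairs and could be passed as `score_pairs` instead.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    score_pairs: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and keep the top_k passages."""
    pairs = [(query, passage) for passage in passages]
    scores = score_pairs(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def overlap_scorer(pairs):
    """Toy stand-in for a cross-encoder: fraction of query words in the passage."""
    scores = []
    for query, passage in pairs:
        q_words = set(query.lower().split())
        p_words = set(passage.lower().split())
        scores.append(len(q_words & p_words) / len(q_words))
    return scores


passages = [
    "Paris is the capital of France.",
    "Python was created by Guido van Rossum.",
]
result = rerank("who created python", passages, overlap_scorer, top_k=1)
print(result)
```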
| ## Stage 5: FLAN-T5 Refinement | |
| Final answer refinement using: | |
| - `google/flan-t5-base` | |
| Generates concise refined outputs from retrieved context. | |
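The card does not document how the retrieved context is presented to FLAN-T5. As a rough sketch under that caveat, a refinement step usually starts by packing the top passages and the question into one instruction prompt. The function name `build_refinement_prompt`, the prompt wording, and the `max_chars` budget below are all hypothetical.

```python
def build_refinement_prompt(question, passages, max_chars=1500):
    """Assemble a single instruction prompt from the top-ranked passages."""
    context_parts = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        entry = f"[{i}] {passage.strip()}"
        if used + len(entry) > max_chars:
            break  # stay within a rough character budget for the model input
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_refinement_prompt(
    "Who created Python?",
    ["Python was created by Guido van Rossum.", "Paris is the capital of France."],
)
print(prompt)
```

The resulting string would then be passed to the seq2seq model for generation; a character budget is only a crude proxy for the model's token limit.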
| --- | |
| # Features | |
| - Hybrid sparse+dense retrieval | |
| - FAISS accelerated search | |
| - MMR diversity optimization | |
| - Neural reranking | |
| - Generative refinement | |
| - GPU acceleration | |
| - Plug-and-play pipeline | |
| - Lightweight deployment | |
| - Kaggle compatible | |
| - Hugging Face compatible | |
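The card does not say how the BM25 and dense candidate lists are merged into one hybrid ranking. One common option is Reciprocal Rank Fusion (RRF), sketched below; whether `pipeline.py` uses RRF, score interpolation, or a simple union is not stated, so treat this as an illustration only. The document ids and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document receives sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


bm25_ranking = ["d1", "d3", "d2"]   # hypothetical sparse results
dense_ranking = ["d2", "d1", "d4"]  # hypothetical dense results
fused_ranking = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused_ranking)  # ['d1', 'd2', 'd3', 'd4']
```

Documents that appear in both lists (`d1`, `d2`) rise above those found by only one retriever, and no score normalization between BM25 and cosine similarity is needed.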
| --- | |
| # Repository Structure | |
| ```text | |
| ApexRetriever-Pro/ | |
| │ | |
| ├── bi_encoder/ | |
| ├── reranker/ | |
| ├── flan_t5/ | |
| ├── pipeline.py | |
| └── README.md | |
| ``` | |
| --- | |
| # Installation | |
| ```bash | |
| pip install -U \ | |
| sentence-transformers \ | |
| transformers \ | |
| faiss-cpu \ | |
| rank-bm25 \ | |
| torch | |
| ``` | |
| --- | |
| # Quick Start | |
| ```python | |
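| # Run from the repository root so that pipeline.py is importable. | |
| from pipeline import ApexRetrieverPro | |
| # model_dir presumably points at the folder holding bi_encoder/, reranker/, and flan_t5/. | |
| retriever = ApexRetrieverPro(model_dir=".") | |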
| # Example documents | |
| docs = [ | |
| "Python was created by Guido van Rossum.", | |
| "Paris is the capital of France.", | |
| "Transformers power modern LLMs." | |
| ] | |
| # Build index | |
| retriever.index_documents(docs) | |
| # Retrieve | |
| results = retriever.retrieve( | |
| "Who created Python?", | |
| top_k=3 | |
| ) | |
| print(results) | |
| ``` | |
| --- | |
| # Example Output | |
| ```python | |
| [ | |
| 'Python was created by Guido van Rossum.' | |
| ] | |
| ``` | |
| --- | |
| # Use Cases | |
| * Retrieval-Augmented Generation (RAG) | |
| * AI chatbots | |
| * Local document search | |
| * Agent memory systems | |
| * Knowledge bases | |
| * Research copilots | |
| * Semantic indexing | |
| * QA systems | |
| * Enterprise search | |
| --- | |
| # Performance Notes | |
| Recommended: | |
| * CUDA GPU | |
| * 16 GB+ RAM | |
| * Python 3.10+ | |
| Works on: | |
| * Kaggle | |
| * Colab | |
| * Local GPU systems | |
| * Linux | |
| * Windows | |
| --- | |
| # Model Components | |
| | Component | Model | | |
| | ------------- | ------------------------------------ | | |
| | Dense Encoder | BAAI/bge-small-en-v1.5 | | |
| | Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 | | |
| | Refiner | google/flan-t5-base | | |
| | Vector Engine | FAISS | | |
| | Sparse Search | BM25 | | |
| --- | |
| # License | |
| Apache 2.0 | |
| --- | |
| >QuantaSparkLabs | |