---
title: Hadith Search
emoji: ๐
colorFrom: indigo
colorTo: green
sdk: docker
pinned: false
license: mit
---
# Hadith Search
**Semantic search across thousands of Prophetic traditions: find the Hadith closest to your question by meaning, not just keywords.**
---
## What Is This?
A hybrid AI-powered search engine over a comprehensive corpus of Islamic Hadith (prophetic traditions). It combines neural semantic embeddings with classical BM25 and anchor-based retrieval to surface the most relevant traditions, even when your query is worded differently from the Hadith itself.
Each result includes the full Hadith text, its chain of narration (Isnad), topic classification, and a direct source link.
---
## Demo
**[Live on HuggingFace Spaces](https://huggingface.co/spaces/NightPrince/Hadith_Search)**
---
## How It Works
```
               User Query (Arabic)
                        │
                        ▼
┌───────────────────────────────────────────────┐
│  Arabic Preprocessing                         │
│  Remove tashkeel · Normalize letters          │
│  Unicode variant unification                  │
└───────────────────────┬───────────────────────┘
                        │
                        ▼
┌───────────────────────────────────────────────┐
│  Hybrid Search (3 signals)                    │
│                                               │
│  ① Anchor    40%  hadith entity match         │
│  ② Semantic  35%  neural meaning match        │
│  ③ BM25      25%  keyword precision           │
│                                               │
│  Model: paraphrase-multilingual-MiniLM-L12-v2 │
└───────────────────────┬───────────────────────┘
                        │
                        ▼
              Top-K ranked Hadiths
       (text · isnad · topic · source URL)
```
The **anchor signal** carries the largest weight (40%) because Hadith have strong named-entity anchors (narrators, topics, keywords) that are highly discriminative, which makes entity-aware matching the dominant signal.
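
To make the 40/35/25 weighting concrete, here is a minimal sketch of how three per-document score lists could be fused into one ranking. The weights come from the diagram above; the min-max normalization step and the names `min_max` and `fuse` are assumptions for illustration, and the actual logic in `retrieval.py` may differ.

```python
# Weights taken from the diagram; everything else is a hypothetical sketch.
WEIGHTS = {"anchor": 0.40, "semantic": 0.35, "bm25": 0.25}


def min_max(scores):
    """Rescale scores to [0, 1] so signals on different scales are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All documents scored the same: this signal cannot discriminate.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse(anchor, semantic, bm25, top_k=5):
    """Return the top_k (row_index, combined_score) pairs, best first.

    Each argument is a list with one raw score per Hadith row.
    """
    signals = {
        "anchor": min_max(anchor),
        "semantic": min_max(semantic),
        "bm25": min_max(bm25),
    }
    combined = [
        sum(WEIGHTS[name] * signals[name][i] for name in WEIGHTS)
        for i in range(len(anchor))
    ]
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return [(i, combined[i]) for i in ranked[:top_k]]
```

Under these assumptions, a row that wins on both the semantic and BM25 signals (0.35 + 0.25 = 0.60) can still outrank a row that wins only on the anchor signal (0.40), so the anchor weight dominates any single signal but not the other two combined.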
---
## Features
- **Anchor-weighted hybrid**: prioritizes entity matching (40%) over pure semantics
- **Full Hadith metadata**: text, Isnad chain, topic classification, source URL
- **Arabic-native**: built for Arabic queries with proper diacritic handling
- **RTL Arabic UI**: responsive glassmorphism design
- **Fast cold start**: model baked into the Docker image at build time
- **Cached embeddings**: TTL-based in-memory cache for repeated queries
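
The diacritic handling mentioned above can be sketched with the standard library alone. This is an illustrative assumption about what the preprocessing might look like, not the contents of `utils.py`: the function name `normalize_arabic`, the exact character ranges, and the letter mappings are all hypothetical choices.

```python
import re
import unicodedata

# Tashkeel marks U+064B..U+0652, superscript alef U+0670, and tatweel U+0640.
# (Assumed set; the project's own list may be broader or narrower.)
TASHKEEL = re.compile(r"[\u064B-\u0652\u0670\u0640]")

# Common letter unifications (assumed, not taken from the project).
LETTER_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda       -> bare alef
    "\u0649": "\u064A",  # alef maqsura          -> yeh
    "\u0629": "\u0647",  # teh marbuta           -> heh
})


def normalize_arabic(text):
    """Strip diacritics and unify letter variants before indexing or querying."""
    text = unicodedata.normalize("NFKC", text)  # fold Unicode presentation forms
    text = TASHKEEL.sub("", text)               # remove tashkeel and tatweel
    text = text.translate(LETTER_MAP)           # unify letter variants
    return " ".join(text.split())               # collapse whitespace
```

Applying the same function to both the corpus and the incoming query is what lets a fully vocalized Hadith match an unvocalized query.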
---
## Tech Stack
| Layer | Technology |
|---|---|
| Backend | FastAPI + Uvicorn |
| Embeddings | `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` |
| Vector Search | FAISS (CPU) |
| Keyword Search | BM25 (`rank-bm25`) |
| Frontend | Vanilla HTML/CSS/JS, RTL Arabic |
| Deployment | Docker on HuggingFace Spaces |
---
## Project Structure
```
├── app.py                       # FastAPI entrypoint, /api/search endpoint
├── hadith_mcp.py                # Search orchestrator, RAG initialization
├── retrieval.py                 # Hybrid search: BM25 + semantic + anchor
├── hf_model.py                  # Thread-safe SentenceTransformer + TTL cache
├── utils.py                     # Arabic text utilities (tashkeel, normalization)
├── index.html                   # Frontend UI
├── assets/
│   ├── script.js                # Fetch + render result cards
│   └── style.css                # Glassmorphism RTL design
├── data/
│   ├── hadith.csv               # Hadith corpus (text, isnad, title, topic, url)
│   ├── hadith_embeddings.npy    # Pre-computed embeddings
│   ├── bm25.pkl                 # BM25 index
│   ├── faiss_anchor.index       # FAISS anchor index
│   ├── anchor_dict.pkl          # anchor → hadith row indices
│   └── unique_anchor_texts.pkl  # Ordered anchor list
└── Dockerfile
```
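
As a rough illustration of the "thread-safe + TTL cache" idea noted for `hf_model.py`, the sketch below caches the result of an expensive call (such as embedding a query) for a fixed time. The class name `TTLCache`, its parameters, and the 300-second default are assumptions; the real module may be organized quite differently.

```python
import threading
import time


class TTLCache:
    """Tiny thread-safe cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._lock = threading.Lock()
        self._store = {}               # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, or call compute(key) and cache it."""
        now = self._clock()
        with self._lock:
            hit = self._store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]
        # Compute outside the lock so a slow model call does not block readers.
        value = compute(key)
        with self._lock:
            self._store[key] = (self._clock() + self._ttl, value)
        return value
```

Computing outside the lock means two threads may occasionally embed the same query at once; that trades a little duplicate work for never holding the lock during inference.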
---
## API
### `POST /api/search`
```json
// Request
{ "query": "إنما الأعمال بالنيات", "top_k": 5 }
// Response
{
  "results": [
    {
      "rank": 1,
      "title": "حديث النية",
      "text": "عَنْ عُمَرَ بْنِ الْخَطَّابِ قَالَ سَمِعْتُ رَسُولَ اللَّهِ...",
      "topic": "النية والإخلاص",
      "source_url": "https://..."
    }
  ]
}
```
`top_k` accepts values from 1 to 10.
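
A minimal client sketch using only the standard library is shown below. The endpoint path, the `query` and `top_k` fields, and the `results` key come from the example above; the helper names `build_search_request` and `search`, the default base URL (the local port from the setup section), and the 30-second timeout are assumptions.

```python
import json
import urllib.request


def build_search_request(query, top_k=5, base_url="http://localhost:7860"):
    """Build the POST request for /api/search without sending it."""
    if not 1 <= top_k <= 10:
        raise ValueError("top_k must be between 1 and 10")
    body = json.dumps({"query": query, "top_k": top_k}, ensure_ascii=False)
    return urllib.request.Request(
        f"{base_url}/api/search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, top_k=5, base_url="http://localhost:7860"):
    """Send the request and return the list under the "results" key."""
    request = build_search_request(query, top_k, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]
```

With the server running locally, `search("إنما الأعمال بالنيات", top_k=3)` would return up to three result objects shaped like the response above.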
---
## Local Setup
```bash
pip install -r requirements.txt
uvicorn app:app --host 0.0.0.0 --port 7860 --reload
# open http://localhost:7860
```
---
## Built by
**يحيى النوساني** · [HuggingFace](https://huggingface.co/NightPrince)
---
*Part of a series of Islamic knowledge retrieval engines. See also: [Tafsir Search](https://github.com/NightPrinceY/Tafsir_Search) · [Quran Semantic Retrieval](https://github.com/NightPrinceY/Quran-Semantic-Retrieval)*