Commit 829cfaa by Xlordo · verified · 1 Parent(s): 0cd25e6

Create README.md

Files changed (1)
  1. README.md +47 -11
README.md CHANGED
@@ -1,14 +1,50 @@
  ---
- title: SBERT Semantic Search System
- emoji: πŸ†
- colorFrom: blue
- colorTo: blue
- sdk: gradio
- sdk_version: 5.44.1
- app_file: app.py
- pinned: false
- license: cc-by-nc-2.0
- short_description: SBERT Semantic Search System
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+ # SBERT + FAISS Semantic Search
+
+ This Hugging Face Space hosts a **semantic search system** built with:
+
+ - [Sentence-BERT (SBERT)](https://www.sbert.net/) for embeddings
+ - [FAISS](https://faiss.ai/) for fast vector search
+ - [MS MARCO v1.1 dataset](https://microsoft.github.io/msmarco/) (10,000-passage subset)
+ - [Gradio](https://gradio.app/) for the interactive interface
+
+ ---
+
+ ## 🔹 Features
+ - Enter a **query** to retrieve the **Top-10 most similar passages**.
+ - Optionally provide **ground truth relevant passages** (one per line) to compute **IR metrics**:
+   - Precision@10
+   - Recall@10
+   - F1-score
+   - Mean Reciprocal Rank (MRR)
+   - Normalized Discounted Cumulative Gain (nDCG@10)
+
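As a rough illustration of the metrics listed above, here is a minimal sketch for a single query with binary relevance. It assumes retrieved passages are matched against the ground truth by exact string equality and that the ranked list has no duplicates; the function name `evaluate_top_k` and the returned keys are hypothetical, and the Space's `app.py` may compute these differently. For one query, the MRR value reduces to the reciprocal rank of the first relevant hit.

```python
import math


def evaluate_top_k(retrieved, relevant, k=10):
    """Binary-relevance IR metrics for one query (illustrative sketch).

    retrieved: ranked list of passage strings, best match first (no duplicates assumed)
    relevant:  iterable of ground-truth passage strings
    """
    relevant = set(relevant)
    top_k = retrieved[:k]
    # 1 where the passage at that rank is in the ground truth, else 0
    hits = [1 if passage in relevant else 0 for passage in top_k]
    num_hits = sum(hits)

    precision = num_hits / k
    recall = num_hits / len(relevant) if relevant else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    # Reciprocal rank of the first relevant passage (0 if none was retrieved)
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / rank
            break

    # DCG with binary gains, normalized by the best achievable DCG at k
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mrr": reciprocal_rank,
        "ndcg": ndcg,
    }


# Toy usage: ten retrieved passages, two relevant ones, only one of them retrieved.
metrics = evaluate_top_k(list("abcdefghij"), ["b", "z"])
print(metrics)
```

Dividing precision by `k` rather than by the number of passages actually returned is one possible convention; it penalizes result lists shorter than ten.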
  ---
+
+ ## 🔹 How to Use
+ 1. Type a query into the input box.
+ 2. (Optional) Paste one or more relevant passages into the second box, each on a new line.
+ 3. Press **Submit**.
+ 4. View:
+    - **Top-10 retrieved passages** with FAISS similarity scores
+    - **Evaluation metrics** if ground truth passages were provided
+
+ ---
+
+ ## 🔹 Tech Stack
+ - **Embeddings:** `sentence-transformers/all-mpnet-base-v2`
+ - **Indexing:** FAISS (L2 distance)
+ - **Dataset:** MS MARCO v1.1 (first 10,000 passages)
+ - **Interface:** Gradio
+
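To make the L2 indexing step concrete, the sketch below mimics in plain Python what an exhaustive L2 search does conceptually: compare the query vector with every stored vector and keep the closest ones. It does not use FAISS or SBERT; the toy 2-D vectors stand in for passage embeddings, the function names are hypothetical, and the exact FAISS index type used by this Space is not stated. With an L2 metric, a smaller score means a closer match.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_l2(query_vec, passage_vecs, k=10):
    """Return up to k (distance, passage_index) pairs, closest first.

    This is a brute-force stand-in for a flat L2 index: every stored
    vector is compared with the query, and the k smallest distances win.
    """
    scored = (
        (squared_l2(query_vec, vec), idx)
        for idx, vec in enumerate(passage_vecs)
    )
    return heapq.nsmallest(k, scored)


# Toy usage with made-up 2-D "embeddings" (real embeddings are much longer).
passages = [[0, 0], [1, 0], [3, 4]]
query = [1, 1]
print(search_l2(query, passages, k=2))  # [(1, 1), (2, 0)]
```

The result pairs each distance with the index of the stored vector, so the matching passage text can be looked up by position in the original passage list.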
+ ---
+
+ ## 🔹 Citation
+ If you use this system in research, please cite:
+
+ - [Sentence-BERT](https://arxiv.org/abs/1908.10084)
+ - [MS MARCO](https://microsoft.github.io/msmarco/)
+
  ---
 
+ ## 🔹 Author
+ Built for a research project on **user-centered evaluation of semantic search systems**.