Initial Commit
- README +115 -0
- app.py +66 -0
- build_index.py +33 -0
- catalog.csv +21 -0
- index.html +135 -0
- requirements.txt +5 -0
- search.py +31 -0
README
ADDED
@@ -0,0 +1,115 @@
# 📚 Semantic Library Search

An AI-powered library search engine that finds books by meaning, not just keywords. Built as a sample project for AI/ML.

## What Is Semantic Search?

Traditional library search looks for exact keyword matches. If you search for "books about the cosmos", you might miss books that use the word "universe" or "astronomy" instead.

Semantic search understands *meaning*. It treats "cosmos", "universe", "space", and "astronomy" as related concepts and returns relevant results even when the exact words don't match.
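
To make the keyword limitation concrete, here is a minimal sketch of plain whole-word matching over three descriptions copied from `catalog.csv`. The `keyword_search` helper is hypothetical and only illustrates the problem; it is not part of this project's code.

```python
# Three descriptions taken from catalog.csv.
books = {
    "A Brief History of Time": "An introduction to cosmology and the nature of the universe",
    "Cosmos": "A journey through the universe exploring astronomy and the nature of space",
    "Astrophysics for People in a Hurry": "A quick guide to the biggest ideas in astrophysics and the cosmos",
}


def keyword_search(term, descriptions):
    """Return titles whose description contains `term` as a whole word."""
    term = term.lower()
    return [
        title
        for title, text in descriptions.items()
        if term in text.lower().split()
    ]


matches = keyword_search("cosmos", books)
print(matches)  # ['Astrophysics for People in a Hurry']
```

Only one of the three astronomy books is found, because the other two descriptions say "universe" rather than "cosmos". This is the gap that embedding-based search is meant to close.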

## How It Works

1. **Catalog Loading** — Book metadata is loaded from a CSV catalog file
2. **Embedding Generation** — Each book's title, author, and description are combined and converted into a numerical "meaning fingerprint" (an embedding vector) by a Sentence Transformer model
3. **Vector Indexing** — These fingerprints are stored in a FAISS index for fast similarity search
4. **Query Processing** — When a user searches, their query is converted into a fingerprint and compared against all books in the index
5. **Results Returned** — The closest matching books are returned, ranked by semantic similarity
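
Steps 3 to 5 can be sketched in plain Python. A flat L2 index such as `faiss.IndexFlatL2` compares the query against every stored vector using squared Euclidean distance, and the sketch below does the same by brute force. The three-dimensional vectors and the `squared_l2` and `nearest` helpers are made up for illustration; real embeddings from all-MiniLM-L6-v2 have 384 dimensions and come from the model, not from hand-written numbers.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query_vector, vectors, k):
    """Return the k (title, distance) pairs closest to query_vector."""
    scored = [(title, squared_l2(query_vector, vec)) for title, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical 3-dimensional fingerprints (assumed values, not model output).
toy_index = {
    "Cosmos": [0.9, 0.1, 0.0],
    "Just Mercy": [0.0, 0.2, 0.9],
    "The Martian": [0.7, 0.6, 0.1],
}

# Pretend this is the fingerprint of the query "books about space".
query_vector = [1.0, 0.0, 0.0]

for rank, (title, distance) in enumerate(nearest(query_vector, toy_index, k=2), start=1):
    print(f"{rank}. {title} (distance {distance:.2f})")
```

A smaller distance means a closer match, so the results are sorted in ascending order. Brute-force comparison is fine for a 20-book catalog; larger collections would likely need an approximate index.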

## Technologies Used

- **Python 3.14** — Core programming language
- **Sentence Transformers** — Library for generating semantic embeddings (model: all-MiniLM-L6-v2)
- **FAISS** — Facebook AI Similarity Search, a library for fast vector search
- **FastAPI** — Modern Python web framework for the search API
- **Uvicorn** — ASGI server for running the API
- **Pandas** — Data manipulation and catalog management
- **HTML/CSS/JavaScript** — Frontend search interface

## Project Structure

```
semantic-library-search/
├── catalog.csv              # Library catalog with 20 books
├── build_index.py           # Builds the AI search index
├── search.py                # Command line search interface
├── app.py                   # FastAPI search API
├── index.html               # Web search interface
├── library.index            # Generated FAISS vector index
├── catalog_processed.csv    # Processed catalog data
└── embeddings.pkl           # Saved book embeddings
```

## Getting Started

### Prerequisites
- Python 3.9 or higher
- pip package manager

### Installation

1. Clone the repository:
```
git clone https://github.com/angelacolmen/semantic-library-search.git
cd semantic-library-search
```

2. Create and activate a virtual environment:
```
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # Mac/Linux
```

3. Install dependencies:
```
pip install sentence-transformers faiss-cpu pandas fastapi uvicorn python-multipart
```

4. Build the search index:
```
python build_index.py
```

5. Start the API:
```
uvicorn app:app --reload
```

6. Open your browser and go to:
```
http://127.0.0.1:8000
```
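
To check the API without the browser, the request that `index.html` sends can be reproduced from Python. This is a rough sketch that assumes the running server exposes the `/search` route the web page posts to, accepts the same JSON body (`query` and `num_results`), and answers with a `results` list. The `build_search_request` helper is hypothetical and uses only the standard library.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/search"  # same URL index.html posts to


def build_search_request(query, num_results=3):
    """Build (but do not send) the POST request the web page issues."""
    payload = json.dumps({"query": query, "num_results": num_results}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("books about space and the universe")
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this
# (assumes the response shape index.html expects):
# with urllib.request.urlopen(request) as response:
#     for book in json.load(response)["results"]:
#         print(book["title"], "by", book["author"])
```

The network call is left commented out so the snippet runs even when no server is listening.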

## Example Searches

Try these searches to see semantic search in action:

- `books about space and the universe`
- `stories about race and justice in America`
- `women who made a difference in science`
- `how governments control people`
- `survival against the odds`

Notice how results appear even when the exact search words don't appear in the book titles or descriptions!

## Library Science Applications

This project demonstrates several real-world library applications:

- **Reference Services** — Patrons can describe their research need in plain language and receive relevant resource recommendations
- **Collection Development** — Identify gaps in a collection by searching for topics and seeing what's missing
- **Catalog Enhancement** — Improve discoverability of items that may be poorly described in traditional catalog records
- **Accessibility** — Helps patrons who don't know the exact terminology used in library classification systems

## Future Enhancements

- Connect to a live library catalog via API (e.g. WorldCat, Open Library)
- Add Library of Congress Subject Heading suggestions
- Implement user feedback to improve search results over time
- Scale to larger collections using cloud-based vector databases
- Add multilingual search support

## Author

Built by Angela Colmenares as a sample project for AI/ML.
app.py
ADDED
@@ -0,0 +1,66 @@
import subprocess
import sys

import faiss
import gradio as gr
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

# Load the embedding model
print("Loading search engine...")
model = SentenceTransformer("all-MiniLM-L6-v2")

# Build the index on startup, using the same interpreter that runs this app;
# check=True stops the app if the build script fails
subprocess.run([sys.executable, "build_index.py"], check=True)

index = faiss.read_index("library.index")
df = pd.read_csv("catalog_processed.csv")
print("Ready!")


def search(query, num_results=3):
    """Return the closest catalog entries for `query` as Markdown text."""
    if not query or not query.strip():
        return "Please enter a search query."

    # The slider may pass a float; FAISS expects an integer k
    k = int(num_results)
    query_embedding = model.encode([query])
    distances, indices = index.search(np.asarray(query_embedding, dtype="float32"), k)

    results = ""
    for i, idx in enumerate(indices[0]):
        book = df.iloc[idx]
        results += f"### {i+1}. {book['title']}\n"
        results += f"**Author:** {book['author']}\n"
        results += f"**Subject:** {book['subject']}\n"
        results += f"{book['description']}\n\n"

    return results


# Create Gradio interface
demo = gr.Interface(
    fn=search,
    inputs=[
        gr.Textbox(
            label="Search Query",
            placeholder="e.g. books about space and the universe...",
            lines=2
        ),
        gr.Slider(
            minimum=1,
            maximum=5,
            value=3,
            step=1,
            label="Number of Results"
        )
    ],
    outputs=gr.Markdown(label="Search Results"),
    title="📚 Semantic Library Search",
    description="Search by meaning, not just keywords. Try searching for 'books about space and the universe' or 'stories about race and justice in America'.",
    examples=[
        ["books about space and the universe", 3],
        ["stories about race and justice in America", 3],
        ["women who made a difference in science", 3],
        ["how governments control people", 3],
        ["survival against the odds", 3]
    ]
)

if __name__ == "__main__":
    demo.launch()
build_index.py
ADDED
@@ -0,0 +1,33 @@
import pickle

import faiss
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

# Step 1: Load your catalog
print("Loading catalog...")
df = pd.read_csv("catalog.csv")

# Step 2: Combine the important fields into one sentence per book
df["combined"] = df["title"] + " by " + df["author"] + ". " + df["description"]

# Step 3: Load the AI model
print("Loading AI model (this may take a minute the first time)...")
model = SentenceTransformer("all-MiniLM-L6-v2")

# Step 4: Turn each book's combined text into a vector (list of numbers)
print("Creating embeddings...")
embeddings = model.encode(df["combined"].tolist())

# Step 5: Build the search index (exact search using L2 distance)
print("Building search index...")
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

# Step 6: Save everything for later
faiss.write_index(index, "library.index")
df.to_csv("catalog_processed.csv", index=False)
with open("embeddings.pkl", "wb") as f:
    pickle.dump(embeddings, f)

print("Done! Your library search index is ready.")
catalog.csv
ADDED
@@ -0,0 +1,21 @@
id,title,author,subject,description
1,The Great Gatsby,F. Scott Fitzgerald,Fiction,A story about wealth and the American dream in the 1920s
2,A Brief History of Time,Stephen Hawking,Science,An introduction to cosmology and the nature of the universe
3,To Kill a Mockingbird,Harper Lee,Fiction,A lawyer defends a Black man accused of a crime in the American South
4,The Selfish Gene,Richard Dawkins,Science,An exploration of evolution from the perspective of genes
5,Sapiens,Yuval Noah Harari,History,A sweeping history of humankind from prehistoric times to the present
6,Cosmos,Carl Sagan,Science,A journey through the universe exploring astronomy and the nature of space
7,The Immortal Life of Henrietta Lacks,Rebecca Skloot,Science,The story of a Black woman whose cancer cells were taken without consent and used for medical research
8,Just Mercy,Bryan Stevenson,Law,A lawyer fights for wrongly condemned prisoners on death row in America
9,The Origin of Species,Charles Darwin,Science,The foundational text of evolutionary biology explaining natural selection
10,"Guns, Germs, and Steel",Jared Diamond,History,An explanation of why some civilizations came to dominate others throughout history
11,The Color of Law,Richard Rothstein,History,How the American government segregated the country through housing policies
12,Hidden Figures,Margot Lee Shetterly,History,The story of Black female mathematicians who worked at NASA during the space race
13,The Martian,Andy Weir,Fiction,An astronaut is stranded alone on Mars and must use science to survive
14,Astrophysics for People in a Hurry,Neil deGrasse Tyson,Science,A quick guide to the biggest ideas in astrophysics and the cosmos
15,The New Jim Crow,Michelle Alexander,Law,How mass incarceration functions as a system of racial control in America
16,A Short History of Nearly Everything,Bill Bryson,Science,A journey through the history of science and how humans came to understand the world
17,The Warmth of Other Suns,Isabel Wilkerson,History,The story of the Great Migration of Black Americans from the South to the North
18,Contact,Carl Sagan,Fiction,A scientist receives a mysterious signal from deep space and must decide how humanity should respond
19,The Sixth Extinction,Elizabeth Kolbert,Science,An investigation into how human activity is causing a mass extinction of species on Earth
20,Between the World and Me,Ta-Nehisi Coates,History,A father's letter to his son about the history and reality of being Black in America
index.html
ADDED
@@ -0,0 +1,135 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Semantic Library Search</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            max-width: 800px;
            margin: 40px auto;
            padding: 20px;
            background-color: #f5f5f5;
        }
        h1 {
            color: #2c3e50;
            text-align: center;
        }
        p.subtitle {
            text-align: center;
            color: #7f8c8d;
            margin-bottom: 30px;
        }
        .search-box {
            display: flex;
            gap: 10px;
            margin-bottom: 30px;
        }
        input {
            flex: 1;
            padding: 12px;
            font-size: 16px;
            border: 2px solid #bdc3c7;
            border-radius: 6px;
        }
        button {
            padding: 12px 24px;
            background-color: #2980b9;
            color: white;
            border: none;
            border-radius: 6px;
            font-size: 16px;
            cursor: pointer;
        }
        button:hover {
            background-color: #2471a3;
        }
        .result {
            background: white;
            padding: 20px;
            margin-bottom: 15px;
            border-radius: 8px;
            border-left: 5px solid #2980b9;
            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
        }
        .result h3 {
            margin: 0 0 5px 0;
            color: #2c3e50;
        }
        .result .author {
            color: #7f8c8d;
            font-style: italic;
            margin-bottom: 8px;
        }
        .result .subject {
            display: inline-block;
            background: #eaf4fb;
            color: #2980b9;
            padding: 3px 10px;
            border-radius: 12px;
            font-size: 13px;
            margin-bottom: 8px;
        }
        .result p {
            color: #555;
            margin: 0;
        }
        #status {
            text-align: center;
            color: #7f8c8d;
            font-style: italic;
        }
    </style>
</head>
<body>
    <h1>📚 Semantic Library Search</h1>
    <p class="subtitle">Search by meaning, not just keywords</p>

    <div class="search-box">
        <input type="text" id="query" placeholder="e.g. books about space and the universe..." />
        <button onclick="search()">Search</button>
    </div>

    <div id="status"></div>
    <div id="results"></div>

    <script>
        // Pressing Enter in the search box runs the search
        document.getElementById('query').addEventListener('keypress', function(e) {
            if (e.key === 'Enter') search();
        });

        async function search() {
            const query = document.getElementById('query').value;
            if (!query) return;

            document.getElementById('status').textContent = 'Searching...';
            document.getElementById('results').innerHTML = '';

            try {
                const response = await fetch('http://127.0.0.1:8000/search', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({ query: query, num_results: 3 })
                });
                // fetch() only rejects on network failure, so check the HTTP status too
                if (!response.ok) throw new Error(`HTTP ${response.status}`);

                const data = await response.json();
                document.getElementById('status').textContent = '';

                data.results.forEach(book => {
                    document.getElementById('results').innerHTML += `
                        <div class="result">
                            <h3>${book.title}</h3>
                            <div class="author">by ${book.author}</div>
                            <span class="subject">${book.subject}</span>
                            <p>${book.description}</p>
                        </div>
                    `;
                });
            } catch (error) {
                document.getElementById('status').textContent = 'Error connecting to search engine. Make sure the API is running!';
            }
        }
    </script>
</body>
</html>
requirements.txt
ADDED
@@ -0,0 +1,5 @@
sentence-transformers
faiss-cpu
pandas
gradio
numpy
search.py
ADDED
@@ -0,0 +1,31 @@
import faiss
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

# Load everything we built
print("Loading search engine...")
model = SentenceTransformer("all-MiniLM-L6-v2")
index = faiss.read_index("library.index")
df = pd.read_csv("catalog_processed.csv")

print("Ready! Type your search question below.")
print("Type 'quit' to exit\n")

# Search loop
while True:
    query = input("Search: ").strip()
    if query.lower() == "quit":
        break
    if not query:
        continue  # ignore empty input instead of searching for nothing

    # Turn your search query into a meaning fingerprint
    query_embedding = model.encode([query])

    # Find the 3 most similar books
    distances, indices = index.search(np.asarray(query_embedding, dtype="float32"), 3)

    print("\nTop results:")
    for i, idx in enumerate(indices[0]):
        book = df.iloc[idx]
        print(f"{i+1}. {book['title']} by {book['author']}")
        print(f"   {book['description']}\n")