---
title: Simple Search Engine
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---
# Simple Search Engine

An intelligent document search engine powered by Sentence Transformers (SBERT) and FastAPI.

## Features

- **Semantic Search**: Uses the `all-MiniLM-L6-v2` model for understanding query context
- **Fast & Efficient**: Built with FastAPI for high performance
- **Beautiful UI**: Clean, modern interface with gradient design
- **Real-time Results**: Instant search results with similarity scores

## How It Works

1. Documents are chunked into smaller segments (3 sentences each)
2. Each chunk is encoded using SBERT into vector embeddings
3. User queries are encoded and compared using cosine similarity
4. Top 5 most relevant chunks are returned with similarity scores
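
The four steps above can be sketched in plain Python. This is a minimal illustration, not the app's actual code: `chunk_text`, `toy_embed`, and `search` are hypothetical names, the regex splitter stands in for NLTK's sentence tokenizer, and the bag-of-words vector stands in for an SBERT embedding. In the real pipeline, the embedding step would presumably be something like `SentenceTransformer("all-MiniLM-L6-v2").encode(chunks)`.

```python
import math
import re
from collections import Counter


def chunk_text(text, sentences_per_chunk=3):
    """Split text into chunks of up to `sentences_per_chunk` sentences."""
    # Naive regex splitter; an assumed stand-in for NLTK's sent_tokenize.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def toy_embed(text, vocabulary):
    """Bag-of-words count vector; a placeholder for an SBERT embedding."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, chunks, top_k=5):
    """Return up to `top_k` (score, chunk) pairs, highest similarity first."""
    vocabulary = sorted(
        {word for chunk in chunks for word in re.findall(r"[a-z0-9]+", chunk.lower())}
    )
    chunk_vectors = [toy_embed(chunk, vocabulary) for chunk in chunks]
    query_vector = toy_embed(query, vocabulary)
    scored = [
        (cosine_similarity(query_vector, vector), chunk)
        for vector, chunk in zip(chunk_vectors, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Unlike this keyword-overlap stand-in, a real SBERT embedding can also match chunks that share no words with the query, which is the point of semantic search.
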

## Technology Stack

- **Backend**: FastAPI
- **ML Model**: Sentence Transformers (all-MiniLM-L6-v2)
- **NLP**: NLTK for sentence tokenization
- **Similarity**: Scikit-learn for cosine similarity computation

## Usage

Simply enter your search query in the search box and press Enter or click the Search button. The engine will return the top 5 most relevant document chunks with their similarity scores.

## API Endpoints

- `GET /` - Web interface
- `POST /search` - Search endpoint (accepts JSON with `query` field)
- `GET /health` - Health check endpoint
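
A client for the `POST /search` endpoint might look like the sketch below, using only the Python standard library. The helper names `build_search_request` and `search` are hypothetical, and the base URL is an assumption (a local run on the configured port 7860). The response format is not documented here, so the sketch simply returns whatever JSON the server sends back.

```python
import json
import urllib.request


def build_search_request(base_url, query):
    """Build a POST request to /search carrying a JSON body with a `query` field."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url, query, timeout=10):
    """Send the request and return the decoded JSON response (shape not assumed)."""
    request = build_search_request(base_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, `search("http://localhost:7860", "machine learning AI")` would send the first example query.
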

## Example Queries

- "machine learning AI"
- "cloud infrastructure AWS"
- "financial reports revenue"
- "marketing SEO strategies"