title: Week13
emoji: 🐨
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: AIPI510 week13 assignment

Yelp Reviews Semantic Search Engine

A semantic search application that allows users to search through Yelp business reviews using natural language queries.

How It Works

  1. Data Source: Loads a sample of the Yelp Review Full dataset from Hugging Face (650K reviews in total)
  2. Embeddings: Converts reviews to vector representations using sentence-transformers/all-MiniLM-L6-v2
  3. Vector Database: Stores embeddings in ChromaDB for efficient similarity search
  4. Search: Compares user queries to review embeddings to find semantically similar matches
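
The search step can be illustrated with a minimal, dependency-free sketch. It is not the code in app.py: the `embed` function below is a toy bag-of-words stand-in for sentence-transformers/all-MiniLM-L6-v2, and the linear scan in `search` stands in for a ChromaDB query. All function names and the sample reviews are hypothetical. Only the ranking idea carries over: embed the query, score every review by cosine similarity, and return the top matches.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector.

    Assumption for illustration only. The real app uses a neural
    sentence-embedding model, which also matches paraphrases.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, reviews, top_k=3):
    """Return the top_k (score, review) pairs, highest similarity first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(r)), r) for r in reviews]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical sample reviews, not taken from the dataset.
reviews = [
    "great food and friendly staff",
    "slow service and cold food",
    "parking was hard to find",
]

results = search("great food", reviews, top_k=2)
for score, review in results:
    print(f"{score:.3f}  {review}")
```

Because the toy vectors only count exact words, this sketch cannot match a query like "tasty meals" to "great food". Closing that gap is the reason the app uses a sentence-embedding model, and a vector database avoids rescanning every review on each query.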

Usage

Simply enter a natural language query describing what you're looking for:

Example Queries:

  • "great food and friendly service"
  • "romantic atmosphere perfect for date night"
  • "fast service and good prices"
  • "authentic Italian cuisine"
  • "disappointed with the quality"

The app returns the reviews that are most semantically similar to the query.

Technical Stack

  • Python 3.10+
  • Gradio: Web interface
  • Sentence Transformers: Text embeddings
  • ChromaDB: Vector database
  • Hugging Face Datasets: Data source

Local Setup

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py

Dataset

This app uses a sample of reviews from the Yelp Review Full dataset, which contains 650,000 Yelp reviews with star ratings from 1 to 5. The reviews cover many kinds of businesses, including restaurants, shops, and services.

Project Info

Course: AIPI 510 - Data Sourcing for Analytics
Institution: Duke University
Week: 13 - Semantic Search