# Retail Product Knowledge Assistant (RAG Model)

This project builds a Retrieval-Augmented Generation (RAG) model over retail product data (Kindle reviews and product details). Product information is stored in a vector database, and an LLM uses the retrieved context to answer questions in natural language.

## Tech Stack
- **LLM**: Google Gemini (via `langchain-google-genai`)
- **Embeddings**: HuggingFace (`all-MiniLM-L6-v2`)
- **Vector Store**: ChromaDB
- **Framework**: LangChain
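
A minimal sketch of how these pieces typically fit together (the real pipeline lives in `rag_model.py`; the exact import paths, Gemini model name, and retriever settings below are assumptions and may differ from the code):

```python
# Sketch only: assumes langchain, langchain-google-genai, langchain-huggingface,
# langchain-chroma, and python-dotenv are installed. Import paths vary by LangChain version.
from dotenv import load_dotenv
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains import RetrievalQA

load_dotenv()  # reads GOOGLE_API_KEY from .env

# Use the same embedding model at query time as at build time.
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Open the persisted Chroma store and expose it as a retriever.
vectorstore = Chroma(persist_directory="chroma_db", embedding_function=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Gemini answers questions grounded in the retrieved product chunks.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")  # model name is illustrative
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

print(qa_chain.invoke({"query": "Which Kindle model has the best resolution?"})["result"])
```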

## Setup
1. **API Key**: Add your Google Gemini API Key to the `.env` file:
   ```env
   GOOGLE_API_KEY=your_actual_key_here
   ```
2. **Build Knowledge Base**:
   Run the following command to process the data and build the vector database:
   ```bash
   python main.py
   ```
   The script automatically detects whether the vector database already exists and only builds it when needed (see the sketch below).
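
A rough sketch of what that build-or-reuse check might look like (assumptions: the JSON file name and field names produced by `preprocess.py` are illustrative, as is the `chroma_db` layout):

```python
# Sketch only: build the Chroma store on the first run, reuse it afterwards.
import json
import os

from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings

PERSIST_DIR = "chroma_db"
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

if os.path.isdir(PERSIST_DIR) and os.listdir(PERSIST_DIR):
    # A populated store already exists: reopen it instead of re-embedding.
    vectorstore = Chroma(persist_directory=PERSIST_DIR, embedding_function=embeddings)
else:
    # First run: load the preprocessed records and embed them.
    # "products.json" and its fields are assumptions about preprocess.py's output.
    with open("products.json", encoding="utf-8") as f:
        records = json.load(f)
    docs = [
        Document(page_content=r["text"], metadata={"name": r.get("name", "")})
        for r in records
    ]
    vectorstore = Chroma.from_documents(docs, embeddings, persist_directory=PERSIST_DIR)
```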

## Usage
Run `main.py` and ask questions about the products, such as:
- "Which Kindle model has the best resolution?"
- "What do users say about the battery life of the Paperwhite?"
- "Is the Kindle Voyage worth the extra money?"

## Files
- `7817_1.csv`: Raw product data.
- `preprocess.py`: Cleans and formats data into JSON.
- `rag_model.py`: Contains the logic for the RAG pipeline.
- `main.py`: Interactive CLI for user queries.
- `chroma_db/`: Directory where the vector store is persisted.