# Retail Product Knowledge Assistant (RAG Model)

This project builds a Retrieval-Augmented Generation (RAG) model using retail product data (Kindle reviews and details). It stores product information in a vector database and uses an LLM to answer questions naturally.

## Tech Stack

- **LLM**: Google Gemini (via `langchain-google-genai`)
- **Embeddings**: HuggingFace (`all-MiniLM-L6-v2`)
- **Vector Store**: ChromaDB
- **Framework**: LangChain

## Setup

1. **API Key**: Add your Google Gemini API key to the `.env` file:

   ```env
   GOOGLE_API_KEY=your_actual_key_here
   ```

2. **Build Knowledge Base**: Run the following command to process the data and build the vector database:

   ```bash
   python main.py
   ```

   (The script automatically detects whether the database still needs to be built.)

## Usage

Run `main.py` and ask questions about the products, such as:

- "Which Kindle model has the best resolution?"
- "What do users say about the battery life of the Paperwhite?"
- "Is the Kindle Voyage worth the extra money?"

## Files

- `7817_1.csv`: Raw product data.
- `preprocess.py`: Cleans and formats the data into JSON.
- `rag_model.py`: Logic for the RAG pipeline.
- `main.py`: Interactive CLI for user queries.
- `chroma_db/`: Directory where the vector store is persisted.
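The setup notes say `main.py` detects on its own whether the vector store still needs to be built. A minimal sketch of that check, assuming (as listed under Files) that Chroma persists into `chroma_db/`; the function name `needs_build` is illustrative, not necessarily the project's actual code:

```python
import os

CHROMA_DIR = "chroma_db"

def needs_build(persist_dir: str = CHROMA_DIR) -> bool:
    """Return True when the vector store has not been persisted yet."""
    # Chroma writes its files into persist_dir on the first build, so an
    # absent or empty directory means the knowledge base must be built.
    return not os.path.isdir(persist_dir) or not os.listdir(persist_dir)

if __name__ == "__main__":
    if needs_build():
        print("Vector store missing - building knowledge base...")
    else:
        print("Vector store found - skipping build.")
```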
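`preprocess.py` cleans the raw CSV into JSON documents for embedding. Its exact logic isn't reproduced here; the sketch below assumes the CSV carries `name`, `reviews.rating`, and `reviews.text` columns (hypothetical names chosen for illustration):

```python
import csv
import json

def preprocess(csv_path: str, json_path: str) -> int:
    """Convert raw review rows into JSON documents; return how many were kept."""
    docs = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Drop rows with no review text and trim stray whitespace.
            text = (row.get("reviews.text") or "").strip()
            if not text:
                continue
            docs.append({
                "product": (row.get("name") or "").strip(),
                "rating": row.get("reviews.rating"),
                "text": text,
            })
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(docs, f, indent=2)
    return len(docs)
```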
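`rag_model.py` holds the actual pipeline; it isn't reproduced here. As an orientation aid, this is how the stack's pieces typically wire together in LangChain (the package and model names match the Tech Stack above, but `build_chain`, the prompt wording, and the `k=4` retriever setting are assumptions, not the project's real code):

```python
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_huggingface import HuggingFaceEmbeddings

def build_chain(persist_dir: str = "chroma_db"):
    """Wire the persisted Chroma retriever, a prompt, and Gemini into a chain."""
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    retriever = Chroma(
        persist_directory=persist_dir,
        embedding_function=embeddings,
    ).as_retriever(search_kwargs={"k": 4})

    prompt = ChatPromptTemplate.from_template(
        "Answer using only this product context:\n{context}\n\nQuestion: {question}"
    )
    # ChatGoogleGenerativeAI reads GOOGLE_API_KEY from the environment (.env).
    llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

    return (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
```

A chain built this way is invoked with a plain question string, e.g. `build_chain().invoke("Which Kindle model has the best resolution?")`.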