# Retail Product Knowledge Assistant (RAG Model)

This project builds a Retrieval-Augmented Generation (RAG) model using retail product data (Kindle reviews and details). It stores product information in a vector database and uses an LLM to answer questions naturally.

## Tech Stack

- **LLM**: Google Gemini (via `langchain-google-genai`)
- **Embeddings**: HuggingFace (`all-MiniLM-L6-v2`)
- **Vector Store**: ChromaDB
- **Framework**: LangChain

## Setup

1. **API Key**: Add your Google Gemini API key to the `.env` file:

   ```env
   GOOGLE_API_KEY=your_actual_key_here
   ```

2. **Build Knowledge Base**: Run the following command to process the data and build the vector database:

   ```bash
   python main.py
   ```

   (The script automatically detects whether the database still needs to be built.)

## Usage

Run `main.py` and ask questions about the products, such as:

- "Which Kindle model has the best resolution?"
- "What do users say about the battery life of the Paperwhite?"
- "Is the Kindle Voyage worth the extra money?"

## Files

- `7817_1.csv`: Raw product data.
- `preprocess.py`: Cleans and formats the data into JSON.
- `rag_model.py`: Logic for the RAG pipeline.
- `main.py`: Interactive CLI for user queries.
- `chroma_db/`: Directory where the vector store is persisted.
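The setup notes say `main.py` detects on its own whether the vector store still needs to be built. A minimal sketch of that check, assuming (as listed under Files) that Chroma persists into `chroma_db/`; the function name `needs_build` is illustrative, not necessarily the project's actual code:

```python
import os

CHROMA_DIR = "chroma_db"

def needs_build(persist_dir: str = CHROMA_DIR) -> bool:
    """Return True when the vector store has not been persisted yet."""
    # Chroma writes its files into persist_dir on the first build, so an
    # absent or empty directory means the knowledge base must be built.
    return not os.path.isdir(persist_dir) or not os.listdir(persist_dir)

if __name__ == "__main__":
    if needs_build():
        print("Vector store missing - building knowledge base...")
    else:
        print("Vector store found - skipping build.")
```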
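`preprocess.py` cleans the raw CSV into JSON documents for embedding. Its exact logic isn't reproduced here; the sketch below assumes the CSV carries `name`, `reviews.rating`, and `reviews.text` columns (hypothetical names chosen for illustration):

```python
import csv
import json

def preprocess(csv_path: str, json_path: str) -> int:
    """Convert raw review rows into JSON documents; return how many were kept."""
    docs = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Drop rows with no review text and trim stray whitespace.
            text = (row.get("reviews.text") or "").strip()
            if not text:
                continue
            docs.append({
                "product": (row.get("name") or "").strip(),
                "rating": row.get("reviews.rating"),
                "text": text,
            })
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(docs, f, indent=2)
    return len(docs)
```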
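`rag_model.py` holds the actual pipeline; it isn't reproduced here. As an orientation aid, this is how the stack's pieces typically wire together in LangChain (the package and model names match the Tech Stack above, but `build_chain`, the prompt wording, and the `k=4` retriever setting are assumptions, not the project's real code):

```python
from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_huggingface import HuggingFaceEmbeddings

def build_chain(persist_dir: str = "chroma_db"):
    """Wire the persisted Chroma retriever, a prompt, and Gemini into a chain."""
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    retriever = Chroma(
        persist_directory=persist_dir,
        embedding_function=embeddings,
    ).as_retriever(search_kwargs={"k": 4})

    prompt = ChatPromptTemplate.from_template(
        "Answer using only this product context:\n{context}\n\nQuestion: {question}"
    )
    # ChatGoogleGenerativeAI reads GOOGLE_API_KEY from the environment (.env).
    llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

    return (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
```

A chain built this way is invoked with a plain question string, e.g. `build_chain().invoke("Which Kindle model has the best resolution?")`.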