# Retail Product Knowledge Assistant (RAG Model)
This project builds a Retrieval-Augmented Generation (RAG) assistant over retail product data (Kindle reviews and product details). Product information is embedded into a vector database, and an LLM uses the retrieved context to answer questions in natural language.
## Tech Stack
- **LLM**: Google Gemini (via `langchain-google-genai`)
- **Embeddings**: HuggingFace (`all-MiniLM-L6-v2`)
- **Vector Store**: ChromaDB
- **Framework**: LangChain
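The stack above maps onto a small set of Python dependencies. A plausible `requirements.txt` is sketched below; the exact package names are assumptions, so verify them against the imports in `rag_model.py`:

```text
langchain
langchain-google-genai
langchain-community
chromadb
sentence-transformers
python-dotenv
```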
## Setup
1. **API Key**: Add your Google Gemini API Key to the `.env` file:
```env
GOOGLE_API_KEY=your_actual_key_here
```
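Projects with a `.env` file typically load it via python-dotenv's `load_dotenv()`. As a stdlib-only sketch of what that call does (the function name `load_env` is ours, not the library's):

```python
import os


def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv():
    read KEY=VALUE lines and export them into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Like load_dotenv's default, do not override existing vars.
            os.environ.setdefault(key.strip(), value.strip())
```

After this runs, `os.environ["GOOGLE_API_KEY"]` is available to the Gemini client.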
2. **Build Knowledge Base**:
Run the following command to process the data and build the vector database:
```bash
python main.py
```
(The script detects automatically whether the vector store already exists under `chroma_db/` and rebuilds it only when needed.)
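The "build only if needed" check usually comes down to whether the Chroma persistence directory exists and is non-empty. A sketch of that logic (the function name is hypothetical, not from `main.py`):

```python
from pathlib import Path

PERSIST_DIR = Path("chroma_db")  # persistence directory used by this repo


def needs_build(persist_dir: Path = PERSIST_DIR) -> bool:
    """Return True when the vector store has not been persisted yet,
    i.e. the directory is missing or empty, so main.py should rebuild it."""
    return not persist_dir.is_dir() or not any(persist_dir.iterdir())
```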
## Usage
Run `main.py` and ask questions about the products, such as:
- "Which Kindle model has the best resolution?"
- "What do users say about the battery life of the Paperwhite?"
- "Is the Kindle Voyage worth the extra money?"
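The interactive loop in `main.py` can be pictured as a small REPL around the RAG chain. A minimal sketch, where `answer` stands in for a hypothetical function wrapping the retrieval-plus-LLM call:

```python
def run_cli(answer, ask=input, show=print):
    """Read questions until the user quits, answering each via `answer`.
    `ask` and `show` are injectable for testing (default: input/print)."""
    while True:
        question = ask("Ask about a product (or 'quit'): ").strip()
        if question.lower() in {"quit", "exit", ""}:
            break
        show(answer(question))
```

Passing `ask`/`show` as parameters keeps the loop testable without patching stdin/stdout.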
## Files
- `7817_1.csv`: Raw product data.
- `preprocess.py`: Cleans and formats data into JSON.
- `rag_model.py`: Contains the logic for the RAG pipeline.
- `main.py`: Interactive CLI for user queries.
- `chroma_db/`: Directory where the vector store is persisted.
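The `preprocess.py` step (CSV in, cleaned JSON out) can be sketched with the stdlib alone. The column names below are assumptions for illustration, not the real schema of `7817_1.csv`:

```python
import csv
import json


def csv_to_json(csv_path, json_path,
                keep=("name", "reviews.text", "reviews.rating")):
    """Sketch of preprocess.py's job: read the raw product CSV,
    keep a few columns, drop empty rows, and write a JSON list.
    The `keep` column names are illustrative assumptions."""
    records = []
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            record = {k: row[k].strip() for k in keep if row.get(k)}
            if record:  # skip rows with none of the wanted fields
                records.append(record)
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2)
    return records
```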