# Abalone RAG Chatbot
This project implements a Retrieval-Augmented Generation (RAG) chatbot about Abalone using LangChain + OpenAI with a Streamlit frontend. It's designed to be deployed on Hugging Face Spaces.
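To make the retrieve-then-generate idea concrete, here is a toy sketch in plain Python. It is not the code in this repository: word-count cosine similarity stands in for the vector search that the Chroma vectorstore performs, no LLM is called, and every function name and sample chunk below is an illustrative assumption.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (toy retriever)."""
    query_vec = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(query_vec, Counter(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved chunks ahead of the question for the LLM."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical chunks, for illustration only.
chunks = [
    "Abalone are marine snails with a row of respiratory pores in the shell.",
    "Streamlit renders Python scripts as web apps.",
    "The rings attribute in the abalone dataset is used to estimate age.",
]
question = "How is abalone age estimated?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In the real app the ranking would come from embeddings rather than word counts, but the shape is the same: pick the most relevant chunks, then hand them to the model together with the question.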
## Contents
- `app.py` - Streamlit app entrypoint
- `src/ingest.py` - Ingest files from `data/` into a persisted Chroma vectorstore
- `src/vectorstore.py` - Helpers to build/load the Chroma vectorstore and return a retriever
- `src/qa_chain.py` - Build the conversational retrieval QA chain
- `data/` - Put Abalone source files here (CSV/MD/TXT/PDF)
- `vectorstore/` - Persisted vectorstore directory (created by ingestion)
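As a rough illustration of the first thing an ingestion step has to do, the sketch below collects the supported file types from `data/`. The function name `find_source_files` and the recursive search are assumptions made for this example; `src/ingest.py` may select files differently.

```python
from pathlib import Path

# Extensions taken from the `data/` entry above; matching is case-insensitive.
SUPPORTED_SUFFIXES = {".csv", ".md", ".txt", ".pdf"}


def find_source_files(data_dir: str) -> list[Path]:
    """Return supported files under data_dir, sorted for a stable order."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Data directory not found: {root}")
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Calling `find_source_files("./data")` before ingestion is a quick way to confirm that the files you added are of a type the pipeline is expected to handle.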
## Quickstart (local)
1. Create a venv and install dependencies:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
2. Set your OpenAI API key:
```bash
export OPENAI_API_KEY="sk-..."
```
3. Add Abalone files into `data/` (for example `abalone.csv`).
4. Build the vectorstore:
```bash
python -m src.ingest --data-dir ./data --persist-dir ./vectorstore
```
5. Run the Streamlit app:
```bash
streamlit run app.py
```
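The `export` in step 2 only applies to the shell session where it was run, so a missing key is a common cause of startup errors. A minimal fail-fast check could look like the sketch below; `require_openai_key` and `mask` are hypothetical helpers, not functions from this repository.

```python
import os
from typing import Mapping, Optional


def require_openai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return OPENAI_API_KEY, or raise with a hint if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            'OPENAI_API_KEY is not set. Run: export OPENAI_API_KEY="sk-..."'
        )
    return key


def mask(key: str) -> str:
    """Shorten a secret for log output so the full value is never printed."""
    return key[:3] + "..." if len(key) > 3 else "***"
```

Running a check like this at the top of the app, before any model call, turns a confusing authentication failure into a clear message.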
## Deploying to Hugging Face Spaces
- Add `OPENAI_API_KEY` in the Spaces secrets (Settings -> Secrets).
- Push this repository to your HF Space. Spaces then installs the dependencies from `requirements.txt` and starts the Streamlit app.
- On first run, click the "Ingest data" button or allow the app to rebuild the index.
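The first-run rebuild can be thought of as "ingest only if no index exists yet". The sketch below shows that decision under one assumption: a non-empty persist directory counts as a built index. `index_exists` and `ensure_index` are hypothetical names, and the `ingest` callback stands in for whatever `src/ingest.py` actually exposes.

```python
from pathlib import Path
from typing import Callable


def index_exists(persist_dir: str) -> bool:
    """Treat a directory that exists and contains anything as a built index."""
    root = Path(persist_dir)
    return root.is_dir() and any(root.iterdir())


def ensure_index(persist_dir: str, ingest: Callable[[], None]) -> bool:
    """Run ingest() only when no index is present; return True if it ran."""
    if index_exists(persist_dir):
        return False
    ingest()
    return True
```

Passing the ingestion step in as a callback keeps the check easy to test and leaves the choice of how to build the index to the caller.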
## Security
- Do NOT commit your OpenAI API key. Use HF Spaces Secrets for deployment.
## License
- MIT