# Abalone RAG Chatbot
This project implements a Retrieval-Augmented Generation (RAG) chatbot that answers questions about Abalone. It is built with LangChain and OpenAI, uses Streamlit for the frontend, and is designed to be deployed on Hugging Face Spaces.
## Contents
- `app.py` - Streamlit app entrypoint
- `src/ingest.py` - Ingest files from `data/` into a persisted Chroma vectorstore
- `src/vectorstore.py` - Helpers to build/load the Chroma vectorstore and return a retriever
- `src/qa_chain.py` - Build the conversational retrieval QA chain
- `data/` - Put Abalone source files here (CSV/MD/TXT/PDF)
- `vectorstore/` - Persisted vectorstore directory (created by ingestion)
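
As a rough illustration of the first step of ingestion, the sketch below collects the supported file types from `data/`. The function name `find_source_files` and the recursive search are assumptions made for this example; the actual logic in `src/ingest.py` may differ.

```python
from pathlib import Path

# Extensions taken from the description of data/ above (CSV/MD/TXT/PDF).
SUPPORTED_EXTENSIONS = {".csv", ".md", ".txt", ".pdf"}


def find_source_files(data_dir):
    """Return supported files under data_dir, sorted for a stable order.

    Hypothetical helper: it is not confirmed to exist in src/ingest.py.
    """
    root = Path(data_dir)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Matching on the lowercased suffix means files such as `ABALONE.CSV` are picked up as well, while unrelated files (images, notebooks) are skipped.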
## Quickstart (local)
1. Create a venv and install dependencies:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
2. Set your OpenAI API key:
```bash
export OPENAI_API_KEY="sk-..."
```
3. Add Abalone files into `data/` (for example `abalone.csv`).
4. Build the vectorstore:
```bash
python -m src.ingest --data-dir ./data --persist-dir ./vectorstore
```
5. Run the Streamlit app:
```bash
streamlit run app.py
```
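
The ingestion command in step 4 takes two flags, `--data-dir` and `--persist-dir`. A minimal sketch of how such a command line could be parsed is shown here. The function name `parse_args` and the default values are assumptions for illustration, not a description of what `src/ingest.py` actually does.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the Quickstart ingestion command.

    The defaults below are assumed; the real script may require both flags.
    """
    parser = argparse.ArgumentParser(
        description="Ingest source files into a persisted vectorstore."
    )
    parser.add_argument(
        "--data-dir",
        default="./data",
        help="Directory containing the source files to ingest.",
    )
    parser.add_argument(
        "--persist-dir",
        default="./vectorstore",
        help="Directory where the vectorstore is persisted.",
    )
    # argparse exposes --data-dir as args.data_dir, --persist-dir as args.persist_dir.
    return parser.parse_args(argv)
```

Passing `argv` explicitly, for example `parse_args(["--data-dir", "./data"])`, makes the parser easy to exercise without touching `sys.argv`.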
## Deploying to Hugging Face Spaces
- Add `OPENAI_API_KEY` in the Spaces secrets (Settings -> Secrets).
- Push this repository to your HF Space. HF will install the dependencies listed in `requirements.txt` and run the Streamlit app.
- On first run, click the "Ingest data" button or allow the app to rebuild the index.
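
One way an app could decide whether the index needs rebuilding on first run is to check that the persist directory exists and is not empty. The sketch below assumes exactly that and nothing more: `index_exists` is a hypothetical helper, and it does not inspect which files Chroma writes.

```python
from pathlib import Path


def index_exists(persist_dir="./vectorstore"):
    """Return True if persist_dir is a directory with at least one entry.

    Hypothetical check: it only tests for a non-empty directory and does
    not validate the contents of the persisted vectorstore.
    """
    path = Path(persist_dir)
    return path.is_dir() and any(path.iterdir())
```

If this returns `False`, the app would prompt for ingestion or rebuild the index before answering questions.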
## Security
- Do NOT commit your OpenAI API key. Use HF Spaces Secrets for deployment.
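
Because the key is supplied through the environment (a local `export` or an HF Spaces secret), a startup check can fail early with a clear message when it is missing. This is a minimal sketch: `require_api_key` is a hypothetical helper and is not confirmed to be part of `app.py`.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key from the environment or raise an error.

    env defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it locally or add it "
            "as a secret in your Space settings."
        )
    return key
```

Reading the key from the environment keeps it out of the repository, which is the point of the rule above.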
## License
- MIT