# Abalone RAG Chatbot

This project implements a Retrieval-Augmented Generation (RAG) chatbot about Abalone using LangChain + OpenAI with a Streamlit frontend. It's designed to be deployed on Hugging Face Spaces.
## Contents

- `app.py` - Streamlit app entrypoint
- `src/ingest.py` - Ingests files from `data/` into a persisted Chroma vectorstore
- `src/vectorstore.py` - Helpers to build/load the Chroma vectorstore and return a retriever
- `src/qa_chain.py` - Builds the conversational retrieval QA chain
- `data/` - Put Abalone source files here (CSV/MD/TXT/PDF)
- `vectorstore/` - Persisted vectorstore directory (created by ingestion)
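Before being embedded, source files are typically split into overlapping chunks. The actual splitter in `src/ingest.py` is not shown here; the following is a minimal sketch of the idea, and `chunk_text` is a hypothetical helper, not part of the repo:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, similar to what a text splitter
    does before documents are embedded into the vectorstore."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

The overlap keeps sentence context shared between adjacent chunks, which tends to improve retrieval quality at chunk boundaries.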
## Quickstart (local)

1. Create a virtual environment and install dependencies:

   ```bash
   python -m venv .venv
   source .venv/bin/activate
   pip install -r requirements.txt
   ```
2. Set your OpenAI API key:

   ```bash
   export OPENAI_API_KEY="sk-..."
   ```
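   The app needs this variable at runtime. A small guard like the following (a sketch; `require_openai_key` is a hypothetical helper, not part of the repo) can fail fast with a clear message instead of a deep stack trace from inside LangChain:

   ```python
   import os

   def require_openai_key() -> str:
       """Return the OpenAI API key from the environment, or raise a clear error."""
       key = os.environ.get("OPENAI_API_KEY", "")
       if not key.strip():
           raise RuntimeError(
               "OPENAI_API_KEY is not set. Export it (or add it to the "
               "HF Spaces secrets) before running the app."
           )
       return key
   ```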
3. Add Abalone source files to `data/` (for example `abalone.csv`).
4. Build the vectorstore:

   ```bash
   python -m src.ingest --data-dir ./data --persist-dir ./vectorstore
   ```
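   The two flags suggest `src/ingest.py` parses its arguments roughly like this sketch (the actual module may differ; the defaults shown are assumptions):

   ```python
   import argparse

   def parse_args(argv=None):
       """Parse the CLI flags used by the ingest command shown above."""
       parser = argparse.ArgumentParser(
           description="Ingest files into a persisted Chroma vectorstore"
       )
       parser.add_argument("--data-dir", default="./data",
                           help="Directory containing the source files")
       parser.add_argument("--persist-dir", default="./vectorstore",
                           help="Directory where the index is persisted")
       return parser.parse_args(argv)
   ```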
5. Run the Streamlit app:

   ```bash
   streamlit run app.py
   ```
## Deploying to Hugging Face Spaces

- Add `OPENAI_API_KEY` to the Space's secrets (Settings -> Secrets).
- Push this repository to your HF Space. HF will install `requirements.txt` and run the Streamlit app.
- On first run, click the "Ingest data" button or allow the app to rebuild the index.
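On a fresh Space the persisted `vectorstore/` directory does not exist yet. A check along these lines (a hypothetical helper, not part of the repo) is one way the app can decide whether to offer a rebuild:

```python
from pathlib import Path

def index_exists(persist_dir: str = "./vectorstore") -> bool:
    """Return True if a persisted vectorstore appears to be present
    (the directory exists and is non-empty)."""
    path = Path(persist_dir)
    return path.is_dir() and any(path.iterdir())
```

If this returns `False`, the app can surface the "Ingest data" button instead of attempting to load a missing index.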
## Security

- Do NOT commit your OpenAI API key. Use HF Spaces Secrets for deployment.
## License

- MIT