---
title: AB Testing RAG Agent
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: "3.14"
app_port: 8501
pinned: false
---
# AB Testing RAG Agent

This application is a Streamlit-based frontend for an AB Testing QA system built on a retrieval-augmented generation (RAG) pipeline with a LangGraph architecture.

## Features

- QA system specialized in AB Testing topics
- Intelligent query routing with LangGraph
- Source citations for all answers
- Streamlit interface for easy interaction
## Setup for Development

### Prerequisites

- Python 3.9+
- OpenAI API key
- Hugging Face account and token (for deployment)

### Environment Setup

1. Clone this repository
2. Create a `.env` file in the root directory with the following content:

```
OPENAI_API_KEY=your_openai_api_key_here
HF_TOKEN=your_huggingface_token_here
```
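If you'd rather not add a dependency, a `.env` file like the one above can be loaded with a few lines of standard-library Python. This is a minimal sketch of the idea; the app itself may use a package such as `python-dotenv` instead, and `load_env` is an illustrative name:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            # setdefault so real environment variables win over the file
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
```

Existing environment variables take precedence over the file, which matches how most dotenv loaders behave by default.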
### Process the PDFs

Before running the app, you need to process the PDF files to create the vectorstore:

```bash
python process_data.py
```

This will:

1. Load PDFs from `notebook_version/data/`
2. Process, chunk, and embed the documents
3. Create a Qdrant vectorstore in `data/processed_data/`
### Running the App Locally

Once the data is processed, you can run the Streamlit app:

```bash
streamlit run app/app.py
```
## Deployment to Hugging Face Spaces

### Prerequisites for Deployment

1. Hugging Face account
2. Docker installed locally

### Steps to Deploy

1. Process the PDFs locally: `python process_data.py`
2. Build the Docker image: `docker build -t ab-testing-qa .`
3. Create a new Hugging Face Space (Docker-based)
4. Add your Hugging Face token and OpenAI API key as secrets in the Space
5. Push the Docker image to Hugging Face

### Hugging Face Spaces Configuration

The application is configured to use the following secrets:

- `OPENAI_API_KEY`: Your OpenAI API key
- `HF_TOKEN`: Your Hugging Face token
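Since both secrets are required at runtime, it helps to fail fast with a clear message when one is missing from the Space's settings. A minimal sketch, assuming the variable names above; `check_secrets` is an illustrative helper, not part of the app's actual API:

```python
import os

REQUIRED_SECRETS = ("OPENAI_API_KEY", "HF_TOKEN")

def check_secrets():
    """Return the configured secrets, raising early if any are missing."""
    missing = [name for name in REQUIRED_SECRETS if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required secrets: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED_SECRETS}
```

Calling this once at startup turns a confusing mid-request API failure into an immediate, actionable error in the Space logs.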
## System Architecture

The AB Testing QA system uses a LangGraph architecture with three main components:

1. **Initial RAG Node**: Retrieves documents and attempts to answer the query
2. **Helpfulness Judge**: Determines whether:
   - The query is related to AB Testing
   - The initial response is helpful enough
3. **Agent Node**: If needed, uses specialized tools to improve the answer:
   - Standard retrieval tool
   - Query-rephrasing retrieval tool
   - ArXiv search tool
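The control flow above can be sketched in plain Python. This is a schematic only: the real implementation wires these steps as LangGraph nodes and edges, and the function names here are illustrative:

```python
def answer_query(query, rag_node, judge, agent_node):
    """Schematic of the routing logic: try plain RAG first, and escalate to
    the tool-using agent only when the judge rejects the initial answer."""
    draft = rag_node(query)          # Initial RAG Node: retrieve + answer
    if judge(query, draft):          # Helpfulness Judge: on-topic and helpful?
        return draft                 # good enough -- return the RAG answer
    return agent_node(query, draft)  # Agent Node: retry with specialized tools
```

The benefit of this shape is cost control: the cheaper single-pass RAG path handles easy questions, and the agent with its extra retrieval and ArXiv tools only runs when needed.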
## Data Processing

The system processes the PDFs as follows:

1. Merges PDF pages while preserving page metadata
2. Splits the text with `RecursiveCharacterTextSplitter`
3. Embeds the chunks with OpenAI's `text-embedding-3-small` model
4. Stores the embeddings in a Qdrant vectorstore
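Step 2 uses LangChain's `RecursiveCharacterTextSplitter`; the core idea it implements — fixed-size chunks that overlap so content spanning a boundary is not lost — can be illustrated in plain Python. The chunk size and overlap below are placeholder values, not the parameters `process_data.py` actually uses:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks that overlap, so context crossing
    a chunk boundary appears in both neighboring chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each chunk
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

The real splitter is smarter — it recursively prefers paragraph, sentence, and word boundaries before falling back to raw character cuts — but the overlap mechanic is the same.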