---
title: Rag Test Task
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 4.36.1
app_file: app.py
pinned: false
---
# QA with RAG

## Quick start

The script is designed as a Hugging Face chat interface, so users can use the chat without installing any dependencies. The chat is available here: QA with RAG.

This chat uses a pre-built RAG index (instructions for running the RAG building script are below).
## Key features
- To start the chat, write a question in the chat and press Enter.
- The model saves the history of all conversations, so you can ask questions about previous answers.
- For each of your questions, the model identifies the topic you are interested in and extracts this topic from the query.
- The relevant documents are retrieved from the RAG using the extracted topic, so the model can base its answer on the retrieved documents.
- If your question does not relate to a specific topic, or is a follow-up to the model's previous message, no topic is extracted and the model reuses the previously retrieved documents.
- Retrieved documents are listed below the chat, together with the extracted topic (if no topic was extracted, this field is empty).
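As a rough illustration of the per-turn logic described above (the function names here are hypothetical, and the real topic extraction and retrieval are LLM and RAG calls):

```python
def extract_topic(question: str) -> str:
    """Stub: the real version asks the chat model to extract a topic,
    returning "" for follow-ups that carry no topic of their own."""
    return "" if question.lower().startswith(("why", "what about")) else question

def retrieve(topic: str) -> list:
    """Stub standing in for retrieval from the RAG index."""
    return [f"document about {topic}"]

def answer_turn(question: str, state: dict) -> dict:
    topic = extract_topic(question)
    if topic:  # a new topic was found: retrieve fresh documents
        state["docs"] = retrieve(topic)
    # otherwise keep the previously retrieved documents
    state["topic"] = topic
    return state
```

The key point is the fall-through branch: a topic-less follow-up leaves `state["docs"]` untouched, so the answer is still grounded in the last retrieval.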
## How to run the RAG building script

Before launching the script, create a `.env` file in the root directory with the following content:

```
OPENAI_API_KEY="your_openai_token"
OPENAI_EMBEDDINGS_MODEL="text-embedding-3-large"
CHAT_MODEL="gpt-4o"
PATH_TO_DATASET="Dataset"
PATH_TO_INDEX="faiss_db"
```

Please do not change the `OPENAI_EMBEDDINGS_MODEL` value.
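For illustration, entries in this `KEY="value"` format can be parsed with a few lines of Python (the scripts themselves more likely rely on python-dotenv; this parser is just a sketch):

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY="value" lines like those in the .env file above."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env
```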
Note: PDF files downloaded from the repository may fail to open. In that case, replace the `Dataset` folder with the folder you sent me.
To build the RAG index, run the following commands:

```
pip install -r requirements.txt
python -m nltk.downloader punkt
python ./build_rag.py --path_to_dataset Dataset --path_to_index faiss_db
```

The script builds the RAG index and saves it to the specified path: it reads documents from the `Dataset` folder and writes the index to the `faiss_db` folder.
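The flags accepted by `build_rag.py` presumably map to an argparse interface along these lines (a sketch, not the actual source):

```python
import argparse

def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Build the RAG index")
    # defaults mirror the values used in the command above
    parser.add_argument("--path_to_dataset", default="Dataset")
    parser.add_argument("--path_to_index", default="faiss_db")
    return parser
```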
## How to test retrieval from RAG separately

If you want to inspect the retrieval process without the chat interface, run:

```
python ./test_rag.py --path_to_index faiss_db
```

After launching the script, you can enter queries and see the retrieved documents. To exit, type `exit`.
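The interactive part of `test_rag.py` likely boils down to a loop like this (the `retriever` callable and function name are assumptions; injecting `input_fn`/`print_fn` just makes the sketch testable):

```python
def query_loop(retriever, input_fn=input, print_fn=print):
    """Read queries until the user types `exit`, printing retrieved docs."""
    while True:
        query = input_fn("query> ").strip()
        if query == "exit":
            break
        for doc in retriever(query):
            print_fn(doc)
```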
## Implementation details

### Splitting documents

There are several off-the-shelf semantic splitters that could fit this task, but I decided to build my own: a scalable system needs a flexible splitter that can be easily modified and adapted to specific requirements.
- This splitter works like Agglomerative Clustering but considers the order of sentences in the text.
- It splits the text into clusters of sentences, where each cluster contains sentences that are semantically close to each other and form a sequential order.
- The splitter uses embeddings from the OpenAI embeddings model to calculate the similarity between sentences.
- Each cluster represents a separate document in the RAG index.
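The order-preserving clustering idea can be sketched as follows. This is illustrative only: embeddings here are plain vectors rather than OpenAI embeddings, and the threshold and centroid update are simplifications of whatever `src/rag.py` actually does.

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def split(sentences, embeddings, threshold=0.7):
    """Greedily merge the most similar pair of *adjacent* clusters,
    agglomerative-style, until no pair exceeds the threshold.
    Sentence order is preserved because only neighbours can merge."""
    clusters = [[i] for i in range(len(sentences))]
    vecs = [np.asarray(e, dtype=float) for e in embeddings]
    while len(clusters) > 1:
        sims = [cos(vecs[i], vecs[i + 1]) for i in range(len(clusters) - 1)]
        best = int(np.argmax(sims))
        if sims[best] < threshold:
            break  # no adjacent pair is similar enough to merge
        clusters[best] += clusters.pop(best + 1)
        vecs[best] = (vecs[best] + vecs.pop(best + 1)) / 2  # crude centroid
    # each remaining cluster becomes one document in the RAG index
    return [" ".join(sentences[i] for i in c) for c in clusters]
```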
All implementation details are in the `src/rag.py` file.
### Indexing

I used the Faiss library with the OpenAI embeddings model text-embedding-3-large, since it is the latest and one of the best models for text embeddings.
### Chat interface

I used the LangChain library for the chat interface, since it makes it easy to create a chat that keeps the history of the whole conversation.

Implementation details are in the `src/chat.py` file.
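Conceptually, the history handling amounts to the following (a plain-Python sketch; the actual implementation relies on LangChain's message-history utilities, and the class name here is invented):

```python
class ChatSession:
    """Keeps every (user, assistant) turn so follow-up questions
    can refer to earlier answers."""

    def __init__(self, llm):
        self.llm = llm      # callable: list of messages -> reply string
        self.history = []

    def ask(self, question: str) -> str:
        self.history.append({"role": "user", "content": question})
        reply = self.llm(self.history)  # model sees the full history
        self.history.append({"role": "assistant", "content": reply})
        return reply
```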