Commit 87370c1
Parent(s): 5847e55
added readme

Files changed:
- README.md: +60 −12
- src/chat.py: +1 −7
README.md
CHANGED

@@ -1,12 +1,60 @@
# QA with RAG

### Quick start

The script is designed as a Hugging Face chat interface, so users can simply use the chat without installing any dependencies.

The chat is available here: [QA with RAG](https://huggingface.co/spaces/alexandraroze/rag_test_task).

This chat uses a pre-built RAG (instructions on how to run the script that builds the RAG are given below).

### Key features

1. To start the chat, type a question and press Enter.
2. The model saves the history of all conversations, so you can ask follow-up questions about previous answers.
3. For each question, the model identifies the topic you are interested in and extracts it from the query.
4. Relevant documents are retrieved from the RAG using the extracted topic, so the model can base its answer on them.
5. If your question does not relate to a specific topic, or merely clarifies the model's previous message, no topic is extracted and the model reuses the previously retrieved documents.
6. The retrieved documents and the extracted topic are listed below the chat (if no topic was extracted, the topic field is empty).
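The per-turn flow in features 3–5 can be sketched as follows. This is a hypothetical illustration, not code from this repository: `extract_topic` and `retrieve` stand in for the real LLM call and RAG lookup.

```python
# Hypothetical sketch of the topic-extraction flow described above.
# `extract_topic` and `retrieve` stand in for the real LLM and RAG calls.
def answer_context(question, extract_topic, retrieve, state):
    topic = extract_topic(question)       # returns "" for follow-up questions
    if topic:
        state["docs"] = retrieve(topic)   # fresh retrieval for a new topic
    # otherwise reuse the documents retrieved for the previous turn
    return topic, state.get("docs", [])
```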

### How to run the RAG building script

Before launching the script, create a `.env` file in the root directory with the following content:

```
OPENAI_API_KEY="your_openai_token"
OPENAI_EMBEDDINGS_MODEL="text-embedding-3-large"
CHAT_MODEL="gpt-4o"
PATH_TO_DATASET="Dataset"
PATH_TO_INDEX="faiss_db"
```

Please do not change the `OPENAI_EMBEDDINGS_MODEL` value.
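For reference, the scripts read these values via environment variables, roughly like this (a standard-library sketch; the defaults mirror the `.env` above):

```python
import os

# Defaults mirror the .env file above; OPENAI_API_KEY has no safe default.
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_EMBEDDINGS_MODEL = os.getenv("OPENAI_EMBEDDINGS_MODEL", "text-embedding-3-large")
CHAT_MODEL = os.getenv("CHAT_MODEL", "gpt-4o")
PATH_TO_DATASET = os.getenv("PATH_TO_DATASET", "Dataset")
PATH_TO_INDEX = os.getenv("PATH_TO_INDEX", "faiss_db")
```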

To build the RAG, run the following commands:

```bash
pip install -r requirements.txt
python ./build_rag.py --path_to_dataset Dataset --path_to_index faiss_db
```

The script builds the RAG and saves it to the specified path: it reads the dataset from the `Dataset` folder and writes the index to the `faiss_db` folder.

### How to test retrieval from RAG separately

If you want to inspect the retrieval process without the chat interface, run:

```bash
python ./test_rag.py --path_to_index faiss_db
```

After launching the script, you can enter queries and see the retrieved documents. To exit, enter `exit`.
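The loop implemented by `test_rag.py` can be sketched as follows; `search` is a hypothetical stand-in for the FAISS lookup loaded from `--path_to_index`, and I/O is injectable so the loop is easy to test:

```python
# Hypothetical sketch of the interactive test loop; `search` stands in
# for the real FAISS retrieval loaded from the index path.
def query_loop(search, read=input, write=print):
    while True:
        query = read("query> ").strip()
        if query == "exit":          # typing `exit` leaves the loop
            break
        for doc in search(query):    # show every retrieved document
            write(doc)
```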

### Implementation details

#### Splitting documents

I wrote my own splitter because existing splitters do not consider the semantic meaning of the text. (Some splitters do consider semantic meaning, but I was not satisfied with their quality.)

- The splitter works like agglomerative clustering but preserves the order of sentences in the text.
- It splits the text into clusters of sentences, where each cluster contains consecutive sentences that are semantically close to each other.
- It uses embeddings from the OpenAI embeddings model to calculate the similarity between sentences.
- Each cluster becomes a separate document in the RAG index.

All implementation details are in the `src/rag.py` file.
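A simplified sketch of the idea (not the actual `src/rag.py` code): walk the sentences in order and start a new cluster whenever similarity to the previous sentence drops below a threshold. The `embed` callable and the threshold value are assumptions for illustration.

```python
from math import sqrt

def cosine(a, b):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def split_semantically(sentences, embed, threshold=0.75):
    # Order-preserving clustering: consecutive, semantically close
    # sentences end up in the same cluster (a future RAG document).
    if not sentences:
        return []
    clusters = [[sentences[0]]]
    prev = embed(sentences[0])
    for sentence in sentences[1:]:
        cur = embed(sentence)
        if cosine(prev, cur) >= threshold:
            clusters[-1].append(sentence)   # still the same topic
        else:
            clusters.append([sentence])     # similarity dropped: new cluster
        prev = cur
    return clusters
```

The real implementation merges clusters agglomeratively rather than in one pass, but the order-preserving constraint is the key difference from ordinary clustering.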

#### Indexing

I used the Faiss library and the OpenAI embeddings model text-embedding-3-large, since it is the latest and one of the best models for text embeddings.

#### Chat interface

I used the LangChain library for the chat interface, since it makes it easy to create a chat that saves the history of the whole conversation.
Implementation details are in the `src/chat.py` file.
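The history mechanism can be sketched in plain Python. This mirrors the `store` dictionary used in `src/chat.py`, but the class and method names here are hypothetical:

```python
# Hypothetical sketch of a per-session history store; LangChain's message
# history wrapper uses the same get-or-create pattern keyed by session id.
class HistoryStore:
    def __init__(self):
        self.store = {}

    def get_history(self, session_id):
        # create an empty history on first access, then reuse it
        return self.store.setdefault(session_id, [])

    def append(self, session_id, role, text):
        self.get_history(session_id).append((role, text))
```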

src/chat.py
CHANGED

@@ -15,11 +15,6 @@ GENERATE_ARGS = {
     'max_tokens': int(os.getenv("MAX_NEW_TOKENS", 1024)),
 }
 
-GENERATE_KWARGS = {
-    'top_p': float(os.getenv("TOP_P", 0.6)),
-    'frequency_penalty': max(-2, min(float(os.getenv("FREQ_PENALTY", 0)), 2))
-}
-
 
 class Chat:
 
@@ -31,8 +26,7 @@ class Chat:
         self.assistant_model = base(
             model=model,
             streaming=True,
-            **GENERATE_ARGS
-            model_kwargs=GENERATE_KWARGS
+            **GENERATE_ARGS
         )
 
         self.store = {}