Update README.md
README.md
CHANGED
@@ -7,4 +7,17 @@ sdk: static
 pinned: false
 ---
 
-
+To test your RAG solution, it helps to have a dataset that consists of a text corpus,
+correct responses to queries (e.g. question-answer pairs) to test the solution end-to-end, and ideally a set of relevant passages
+from the text corpus for each query, so that the retrieval component can also be tested separately.
+We call this a question-answer-passages dataset.
+
+There are plenty of large-scale datasets of this kind, such as [Google's Natural Questions](https://ai.google.com/research/NaturalQuestions/).
+
+Still, we lack such datasets that are **small-scale** and **narrow-domain**, which would let us test a RAG solution quickly or see how it performs
+in a certain domain context.
+
+We created this space to build a collection of such datasets to boost the development of RAG solutions.
+
+Datasets consist of:
+* asdf
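
The question-answer-passages idea described in the README text can be sketched in a few lines of Python. This is only an illustration under assumptions: the `QAPExample` class, its field names, the toy `keyword_retriever`, and the three-passage corpus are all hypothetical and are not taken from the README or from any dataset in this space. The sketch shows how the relevant-passage labels let the retrieval component be scored on its own, here with recall@k.

```python
from dataclasses import dataclass, field


@dataclass
class QAPExample:
    """Hypothetical entry: a query, its reference answer, and ids of relevant passages."""
    question: str
    answer: str
    relevant_passage_ids: list[str] = field(default_factory=list)


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant passages that appear among the top-k retrieved ids."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return sum(1 for pid in relevant_ids if pid in top_k) / len(relevant_ids)


def keyword_retriever(corpus, query):
    """Toy stand-in for a real retriever: rank passages by shared lowercase words."""
    query_words = set(query.lower().split())
    scores = {
        pid: len(query_words & set(text.lower().split()))
        for pid, text in corpus.items()
    }
    # sorted() is stable, so ties keep the corpus insertion order.
    return sorted(corpus, key=lambda pid: scores[pid], reverse=True)


# Made-up narrow-domain corpus, used only for illustration.
corpus = {
    "p1": "The warranty covers battery defects for two years.",
    "p2": "Returns are accepted within thirty days of purchase.",
    "p3": "Shipping is free for orders above fifty euros.",
}
example = QAPExample(
    question="How long does the warranty cover battery defects?",
    answer="Two years.",
    relevant_passage_ids=["p1"],
)

ranking = keyword_retriever(corpus, example.question)
score = recall_at_k(ranking, example.relevant_passage_ids, k=1)
print(ranking, score)
```

The `answer` field would serve the end-to-end check (comparing the generated response with the reference answer), while `relevant_passage_ids` is what makes the separate retrieval check possible.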