Create README.md
You can load the model with the `transformers` library as a conditional-generation model and compare the scores of the two relevance tokens described below, or use the MonoT5 implementation from BEIR.

prompt = `Query: {query} Document: {document} Relevant:`

The model generates one of two tokens to indicate whether the document is relevant to the query (Polish for "false" and "true"):
```
token_false='▁fałsz', token_true='▁prawda'
```

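A minimal sketch of the first option, scoring one query-document pair directly with `transformers`. It assumes the checkpoint loads through `AutoModelForSeq2SeqLM` and that both tokens are in the tokenizer vocabulary. The names `model_path`, `build_prompt`, `relevance_from_logits` and `score` are illustrative and not part of this repository. The score is the softmax probability of the "true" token over the two candidate tokens at the first decoding step, which is the usual way monoT5-style rerankers are scored.

```python
import math

# Tokens taken from this README.
TOKEN_FALSE = "▁fałsz"
TOKEN_TRUE = "▁prawda"


def build_prompt(query: str, document: str) -> str:
    """Fill the prompt template given above."""
    return f"Query: {query} Document: {document} Relevant:"


def relevance_from_logits(logit_false: float, logit_true: float) -> float:
    """Softmax over the two token logits; returns the probability of 'true'."""
    m = max(logit_false, logit_true)  # subtract the max for numerical stability
    e_false = math.exp(logit_false - m)
    e_true = math.exp(logit_true - m)
    return e_true / (e_false + e_true)


def score(model_path: str, query: str, document: str) -> float:
    """Relevance score in [0, 1]; model_path is a placeholder for this model."""
    # Imported here so the helpers above work without torch installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_path).eval()

    false_id = tokenizer.convert_tokens_to_ids(TOKEN_FALSE)
    true_id = tokenizer.convert_tokens_to_ids(TOKEN_TRUE)

    inputs = tokenizer(
        build_prompt(query, document),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    # Run a single decoder step and read the logits of the first generated token.
    decoder_input_ids = torch.full(
        (1, 1), model.config.decoder_start_token_id, dtype=torch.long
    )
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]
    return relevance_from_logits(logits[false_id].item(), logits[true_id].item())
```

Calling `score(model_path, query, document)` for each candidate and sorting by the returned value gives a reranked list. Loading the model once and batching the prompts would be the obvious next step for real workloads.
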
The MonoT5 implementation is included in the [BEIR benchmark](https://github.com/beir-cellar/beir):
```
from beir.reranking.models import MonoT5
from beir.reranking import Rerank

# Placeholders: lists of dicts with 'id' and 'text' (plus 'title' for documents).
queries = YOUR_QUERIES
corpus = YOUR_CORPUS
# Convert to the dictionaries BEIR expects, keyed by ID.
queries = {query['id']: query['text'] for query in queries}
corpus = {doc['id']: {'title': doc['title'], 'text': doc['text']} for doc in corpus}

# Placeholder: first-stage retrieval scores to rerank, {query_id: {doc_id: score}}.
results = YOUR_RESULTS
# Placeholder: local path or Hub ID of this model.
model_path = YOUR_MODEL_PATH

cross_encoder_model = MonoT5(model_path, use_amp=False, token_false='▁fałsz', token_true='▁prawda')
reranker = Rerank(cross_encoder_model, batch_size=100)

# Rerank the top 100 first-stage hits per query; returns {query_id: {doc_id: score}}.
rerank_results = reranker.rerank(corpus, queries, results, top_k=100)
```
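
BEIR's `rerank` returns scores in the same nested-dictionary layout it takes as input, `{query_id: {doc_id: score}}`. A small sketch of turning that into ranked lists follows; the helper `top_documents` and the sample scores are made up for illustration.

```python
from typing import Dict, List, Tuple


def top_documents(
    rerank_results: Dict[str, Dict[str, float]], k: int = 10
) -> Dict[str, List[Tuple[str, float]]]:
    """Return the k highest-scoring (doc_id, score) pairs for each query."""
    ranked = {}
    for query_id, doc_scores in rerank_results.items():
        ordered = sorted(doc_scores.items(), key=lambda item: item[1], reverse=True)
        ranked[query_id] = ordered[:k]
    return ranked


# Illustrative scores only, not real model output.
example = {"q1": {"d1": -2.3, "d2": -0.1, "d3": -5.0}}
print(top_documents(example, k=2))  # {'q1': [('d2', -0.1), ('d1', -2.3)]}
```
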