Create README.md #1
by thiagolaitz - opened

README.md ADDED
@@ -0,0 +1,45 @@
# InRanker-small (60M parameters)

InRanker is a version of monoT5 distilled from [monoT5-3B](https://huggingface.co/castorini/monot5-3b-msmarco-10k) with increased effectiveness on out-of-domain scenarios.
Our key insight was to use language models and rerankers to generate as much synthetic "in-domain" training data as possible, i.e., data that closely resembles the data that will be seen at retrieval time. The pipeline used for training consists of two distillation phases that do not require additional user queries or manual annotations: (1) training on existing supervised soft teacher labels, and (2) training on teacher soft labels for synthetic queries generated using a large language model.
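
For intuition, here is a minimal sketch of what soft-label distillation could look like for a monoT5-style reranker, where teacher and student each score a query-document pair via a distribution over the "true"/"false" relevance tokens. This is an illustrative KL-divergence formulation, not the paper's exact training code; the function name and tensor shapes are assumptions.
```python
import torch
import torch.nn.functional as F

def soft_label_distillation_loss(
    student_logits: torch.Tensor,  # (batch, 2) logits over the "false"/"true" tokens
    teacher_probs: torch.Tensor,   # (batch, 2) teacher soft labels; rows sum to 1
) -> torch.Tensor:
    # KL(teacher || student): F.kl_div expects the input in log space
    # and the target as probabilities.
    log_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_student, teacher_probs, reduction="batchmean")
```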

The paper with further details can be found [here](). The code and library are available at
https://github.com/unicamp-dl/InRanker

## Usage
The library was tested using Python 3.10 and can be installed with:
```bash
pip install inranker
```

The code for inference is:
```python
from inranker import T5Ranker

model = T5Ranker(model_name_or_path="unicamp-dl/InRanker-small")

docs = [
    "The capital of France is Paris",
    "Learn deep learning with InRanker and transformers"
]
scores = model.get_scores(
    query="What is the best way to learn deep learning?",
    docs=docs
)
# Each score is a relevance probability in [0, 1], returned in the same order
# as `docs`; sort in descending order to rank from most to least relevant.
sorted_scores = sorted(zip(scores, docs), key=lambda x: x[0], reverse=True)

""" InRanker-small:
sorted_scores = [
    (0.4844, 'Learn deep learning with InRanker and transformers'),
    (7.83e-06, 'The capital of France is Paris')
]
"""
```
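
In a typical deployment, the same call reranks candidates returned by a first-stage retriever (e.g., the top passages from BM25). The following is a minimal sketch of such a retrieve-then-rerank step; only `T5Ranker` and `get_scores` come from the library as shown above, while the `rerank` helper and its `top_k` parameter are illustrative.
```python
from inranker import T5Ranker

model = T5Ranker(model_name_or_path="unicamp-dl/InRanker-small")

def rerank(query: str, candidates: list[str], top_k: int = 10) -> list[str]:
    """Score first-stage candidates with InRanker and keep the top_k best."""
    scores = model.get_scores(query=query, docs=candidates)
    ranked = sorted(zip(scores, candidates), key=lambda x: x[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]

# Candidates would normally come from a retriever such as BM25.
top_docs = rerank(
    query="What is the best way to learn deep learning?",
    candidates=[
        "Learn deep learning with InRanker and transformers",
        "The capital of France is Paris",
    ],
    top_k=1,
)
```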