	# A Python toy example for using AutoReranker

	### Relevance-based IR Data
	To match the standard input format of AutoReranker, the example data is organized as three dictionaries: `run`, `queries`, and `corpus`. A fourth dictionary, `qrel`, holds the relevance judgments used for evaluation.
	Below is an example of how to structure these dictionaries.
	```python
	# Initial retrieval scores: query ID -> {document ID: score}.
	# A trailing "*" in a document ID marks a document judged relevant in `qrel`.
	run = {
	    "q1": {"d2": 0.95, "d1*": 0.70, "d6": 0.25},
	    "q2": {"d4*": 0.88, "d3": 0.73, "d7": 0.20},
	    "q3": {"d5*": 0.91, "d8": 0.60, "d9*": 0.40},
	}

	# Query ID -> query text.
	queries = {
	    "q1": "What city is the capital of France?",
	    "q2": "Who painted the Mona Lisa?",
	    "q3": "√144 equals?",
	}

	# Document ID -> document text. IDs must match those used in `run` and `qrel`.
	corpus = {
	    "d1*": "Paris is the capital of France.",
	    "d2": "London is the capital of the UK.",
	    "d3": "Vincent van Gogh painted 'The Starry Night'.",
	    "d4*": "The painter of the Mona Lisa was Leonardo da Vinci.",
	    "d5*": "The square root of 144 is 12.",
	    "d6": "Berlin is the capital of Germany.",
	    "d7": "Pablo Picasso painted 'Guernica'.",
	    "d8": "The cube root of 27 is 3.",
	    "d9*": "12 is the positive solution to √144.",
	}

	# Relevance judgments: query ID -> {document ID: relevance label}.
	qrel = {
	    "q1": {"d1*": 1},
	    "q2": {"d4*": 1},
	    "q3": {"d5*": 1, "d9*": 1},
	}
	```
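
Because the dictionaries are joined purely by ID, a document ID that appears in `run` or `qrel` but not in `corpus` will only surface later as a lookup error or a silently wrong score. The following is a minimal sketch of a consistency check to run before reranking. The helper name `check_inputs` and the `sample_*` data are hypothetical and not part of AutoReranker.

```python
def check_inputs(run, queries, corpus, qrel):
    """Return a list of ID mismatches; an empty list means the inputs line up."""
    problems = []
    # Check both the retrieval run and the relevance judgments the same way.
    for name, mapping in (("run", run), ("qrel", qrel)):
        for qid, docs in mapping.items():
            if qid not in queries:
                problems.append(f"{name}: query {qid!r} is missing from queries")
            for doc_id in docs:
                if doc_id not in corpus:
                    problems.append(
                        f"{name}: document {doc_id!r} (query {qid!r}) is missing from corpus"
                    )
    return problems


# Hypothetical sample data in the same shape as the dictionaries above.
sample_queries = {"q1": "What city is the capital of France?"}
sample_corpus = {
    "d1*": "Paris is the capital of France.",
    "d2": "London is the capital of the UK.",
}
sample_run = {"q1": {"d2": 0.95, "d1*": 0.70}}
sample_qrel = {"q1": {"d1*": 1}}

print(check_inputs(sample_run, sample_queries, sample_corpus, sample_qrel))  # []

# An ID that differs only by the "*" suffix is reported as missing.
bad_qrel = {"q1": {"d1": 1}}
for problem in check_inputs(sample_run, sample_queries, sample_corpus, bad_qrel):
    print(problem)
```

Returning a list instead of raising on the first mismatch lets you see every inconsistent ID in one pass.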

	### Initialize a reranker and rerank
	Once the data is structured, initialize a `ModularReranker` from a prebuilt method with `from_prebuilt`, then call `rerank` to reorder the candidate documents for each query.

	We use the `ir_measures` library to evaluate the reranked results.
	```python
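	import ir_measures
	from ir_measures import RR
	# ModularReranker must also be imported; its import path is not given in this example.
	reranker = ModularReranker.from_prebuilt('rankgpt', 'Qwen/Qwen2.5-7B-Instruct')
	reranked_result = reranker.rerank(run=run, queries=queries, corpus=corpus)
	print(ir_measures.calc_aggregate([RR@5], qrel, reranked_result))  # mean reciprocal rank over each query's top 5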
	```
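
To make the reported number easier to interpret, here is a minimal pure-Python sketch of what `RR@5` measures: for each query, the reciprocal of the rank of the first relevant document within the top 5, averaged over queries. The function names and the `toy_*` data are hypothetical, and details such as tie handling may differ from `ir_measures`.

```python
def reciprocal_rank_at_k(scored_docs, relevant, k=5):
    """Reciprocal rank of the first relevant document in the top k, or 0.0 if none."""
    # Rank documents by descending score.
    ranked = sorted(scored_docs, key=scored_docs.get, reverse=True)
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if relevant.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0


def mean_rr_at_k(run, qrel, k=5):
    """Average the per-query reciprocal rank over all judged queries."""
    per_query = [
        reciprocal_rank_at_k(run.get(qid, {}), rels, k) for qid, rels in qrel.items()
    ]
    return sum(per_query) / len(per_query) if per_query else 0.0


# Hypothetical toy run: q1 has its relevant document at rank 2, q2 at rank 1.
toy_run = {
    "q1": {"d2": 0.95, "d1*": 0.70, "d6": 0.25},
    "q2": {"d4*": 0.88, "d3": 0.73, "d7": 0.20},
}
toy_qrel = {"q1": {"d1*": 1}, "q2": {"d4*": 1}}

print(mean_rr_at_k(toy_run, toy_qrel))  # (1/2 + 1/1) / 2 = 0.75
```

Computing the same value on the initial `run` gives a baseline to compare against the reranked result.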
