|
|
--- |
|
|
title: TREC Eval |
|
|
emoji: 🤗 |
|
|
colorFrom: blue |
|
|
colorTo: red |
|
|
sdk: gradio |
|
|
sdk_version: 3.19.1 |
|
|
app_file: app.py |
|
|
pinned: false |
|
|
tags: |
|
|
- evaluate |
|
|
- metric |
|
|
description: >- |
|
|
The TREC Eval metric combines a number of information retrieval metrics such as precision and nDCG. It is used to score rankings of retrieved documents with reference values. |
|
|
--- |
|
|
|
|
|
# Metric Card for TREC Eval |
|
|
|
|
|
## Metric Description |
|
|
|
|
|
The TREC Eval metric combines a number of information retrieval metrics such as precision and normalized Discounted Cumulative Gain (nDCG). It is used to score rankings of retrieved documents with reference values. |
|
|
|
|
|
## How to Use |
|
|
```python
|
|
from evaluate import load |
|
|
trec_eval = load("trec_eval") |
|
|
results = trec_eval.compute(predictions=[run], references=[qrel]) |
|
|
``` |
|
|
|
|
|
### Inputs |
|
|
- **predictions** *(dict): A single retrieval run.*
|
|
- **query** *(int): Query ID.* |
|
|
- **q0** *(str): Literal `"q0"`.* |
|
|
- **docid** *(str): Document ID.* |
|
|
- **rank** *(int): Rank of document.* |
|
|
- **score** *(float): Score of document.* |
|
|
- **system** *(str): Tag for current run.* |
|
|
- **references** *(dict): A single qrel.*
|
|
- **query** *(int): Query ID.* |
|
|
- **q0** *(str): Literal `"q0"`.* |
|
|
- **docid** *(str): Document ID.* |
|
|
- **rel** *(int): Relevance of document.* |
|
|
|
|
|
### Output Values |
|
|
- **runid** *(str): Run name.* |
|
|
- **num_ret** *(int): Number of retrieved documents.* |
|
|
- **num_rel** *(int): Number of relevant documents.* |
|
|
- **num_rel_ret** *(int): Number of retrieved relevant documents.* |
|
|
- **num_q** *(int): Number of queries.* |
|
|
- **map** *(float): Mean average precision.* |
|
|
- **gm_map** *(float): Geometric mean average precision.*
|
|
- **bpref** *(float): Binary preference score.*
|
|
- **Rprec** *(float): Precision@R, where R is the number of relevant documents.*
|
|
- **recip_rank** *(float): Reciprocal rank.*
|
|
- **P@k** *(float): Precision@k (k in [5, 10, 15, 20, 30, 100, 200, 500, 1000]).*
|
|
- **NDCG@k** *(float): nDCG@k (k in [5, 10, 15, 20, 30, 100, 200, 500, 1000]).* |
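
All of these values are returned in a single dictionary, so individual scores can be read by key. A minimal illustration, assuming `results` was produced by a `compute` call as in the usage snippet above:

```python
# Assumes `results` comes from trec_eval.compute(predictions=[run], references=[qrel])
print(results["runid"], results["num_q"])
print(f"MAP: {results['map']:.4f}, nDCG@10: {results['NDCG@10']:.4f}")
```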
|
|
|
|
|
### Examples |
|
|
|
|
|
A minimal example looks as follows:
|
|
```python
|
|
import evaluate

qrel = {
|
|
"query": [0], |
|
|
"q0": ["q0"], |
|
|
"docid": ["doc_1"], |
|
|
"rel": [2] |
|
|
} |
|
|
run = { |
|
|
"query": [0, 0], |
|
|
"q0": ["q0", "q0"], |
|
|
"docid": ["doc_2", "doc_1"], |
|
|
"rank": [0, 1], |
|
|
"score": [1.5, 1.2], |
|
|
"system": ["test", "test"] |
|
|
} |
|
|
|
|
|
trec_eval = evaluate.load("trec_eval") |
|
|
results = trec_eval.compute(references=[qrel], predictions=[run]) |
|
|
results["P@5"] |
|
|
0.2 |
|
|
``` |
|
|
|
|
|
A more realistic use case, with an example from [`trectools`](https://github.com/joaopalotti/trectools):
|
|
|
|
|
```python |
|
|
import evaluate
import pandas as pd

qrel = pd.read_csv("robust03_qrels.txt", sep=r"\s+", names=["query", "q0", "docid", "rel"])
|
|
qrel["q0"] = qrel["q0"].astype(str) |
|
|
qrel = qrel.to_dict(orient="list") |
|
|
|
|
|
run = pd.read_csv("input.InexpC2", sep=r"\s+", names=["query", "q0", "docid", "rank", "score", "system"])
|
|
run = run.to_dict(orient="list") |
|
|
|
|
|
trec_eval = evaluate.load("trec_eval") |
|
|
result = trec_eval.compute(predictions=[run], references=[qrel])
|
|
``` |
|
|
|
|
|
```python |
|
|
result |
|
|
|
|
|
{'runid': 'InexpC2', |
|
|
'num_ret': 100000, |
|
|
'num_rel': 6074, |
|
|
'num_rel_ret': 3198, |
|
|
'num_q': 100, |
|
|
'map': 0.22485930431817494, |
|
|
'gm_map': 0.10411523825735523, |
|
|
'bpref': 0.217511695914079, |
|
|
'Rprec': 0.2502547201167236, |
|
|
'recip_rank': 0.6646545943335417, |
|
|
'P@5': 0.44, |
|
|
'P@10': 0.37, |
|
|
'P@15': 0.34600000000000003, |
|
|
'P@20': 0.30999999999999994, |
|
|
'P@30': 0.2563333333333333, |
|
|
'P@100': 0.1428, |
|
|
'P@200': 0.09510000000000002, |
|
|
'P@500': 0.05242, |
|
|
'P@1000': 0.03198, |
|
|
'NDCG@5': 0.4101480395089769, |
|
|
'NDCG@10': 0.3806761417784469, |
|
|
'NDCG@15': 0.37819463408955706, |
|
|
'NDCG@20': 0.3686080836061317, |
|
|
'NDCG@30': 0.352474353427451, |
|
|
'NDCG@100': 0.3778329431025776, |
|
|
'NDCG@200': 0.4119129817248979, |
|
|
'NDCG@500': 0.4585354576461375, |
|
|
'NDCG@1000': 0.49092149290805653} |
|
|
``` |
|
|
|
|
|
## Limitations and Bias |
|
|
The `trec_eval` metric requires the predictions and references to be in the TREC run and qrel formats, respectively.
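
If a retrieval system produces rankings in some other form, they first have to be rearranged into the run dictionary described under Inputs. A minimal sketch, assuming per-query rankings are available as `(doc_id, score)` pairs (the `rankings` variable and the `"my_system"` tag are hypothetical):

```python
# Hypothetical per-query rankings: {query_id: [(doc_id, score), ...]}
rankings = {
    0: [("doc_2", 1.5), ("doc_1", 1.2)],
    1: [("doc_3", 0.9)],
}

# Flatten into the TREC-style run dictionary expected by trec_eval
run = {"query": [], "q0": [], "docid": [], "rank": [], "score": [], "system": []}
for query_id, docs in rankings.items():
    for rank, (doc_id, score) in enumerate(sorted(docs, key=lambda d: d[1], reverse=True)):
        run["query"].append(query_id)
        run["q0"].append("q0")
        run["docid"].append(doc_id)
        run["rank"].append(rank)
        run["score"].append(score)
        run["system"].append("my_system")
```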
|
|
|
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex |
|
|
@inproceedings{palotti2019, |
|
|
author = {Palotti, Joao and Scells, Harrisen and Zuccon, Guido}, |
|
|
title = {TrecTools: an open-source Python library for Information Retrieval practitioners involved in TREC-like campaigns}, |
|
|
series = {SIGIR'19}, |
|
|
year = {2019}, |
|
|
location = {Paris, France}, |
|
|
publisher = {ACM} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Further References |
|
|
|
|
|
- Homepage: https://github.com/joaopalotti/trectools |