---
title: Test ParaScore
emoji: 🤗
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.0.2
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  ParaScore is a metric for scoring the performance of paraphrase generation models.
  See the project at https://github.com/shadowkiller33/ParaScore for more information.
---
# Metric Card for ParaScore

## Metric description

ParaScore is a reference-based metric for scoring the performance of paraphrase generation models. As described in the [original paper](https://arxiv.org/abs/2202.08479), it combines semantic similarity with lexical divergence, so a good paraphrase must both preserve the meaning of the input and differ from it in surface form.
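The intuition can be illustrated with a rough sketch. This is not the official implementation: the real metric uses a BERTScore-style embedding similarity for `sim`, and the weight `omega`, the cap `gamma`, and the function names here are illustrative assumptions, not values taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def diversity(source: str, candidate: str, gamma: float = 0.35) -> float:
    """Normalized edit distance between source and candidate, capped at
    gamma so extreme rewrites stop earning extra credit (capping rule is
    a simplification of the paper's diversity term)."""
    ned = levenshtein(source, candidate) / max(len(source), len(candidate), 1)
    return min(ned, gamma)

def parascore_sketch(sim: float, source: str, candidate: str,
                     omega: float = 0.05) -> float:
    """Combine a precomputed semantic similarity `sim` (e.g. BERTScore)
    with the lexical-diversity bonus."""
    return sim + omega * diversity(source, candidate)
```

A candidate identical to its source gets no diversity bonus, so two candidates with equal semantic similarity are ranked by how much they rewrite the input.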
## How to use

```python
from evaluate import load

parascore = load("transZ/test_parascore")
predictions = ["hello there", "general kenobi"]
references = ["hello there", "general kenobi"]
results = parascore.compute(predictions=predictions, references=references, lang="en")
```
## Output values

ParaScore outputs a dictionary with the following field:

- `score`: ranges from 0.0 to 1.0; higher values indicate better paraphrases.
## Limitations and bias

The [original ParaScore paper](https://arxiv.org/abs/2202.08479) showed that ParaScore correlates well with human judgment at both the sentence and system level, but this depends on the model and language pair selected.
## Citation

```bibtex
@article{Shen2022,
  archivePrefix = {arXiv},
  arxivId = {2202.08479},
  author = {Shen, Lingfeng and Liu, Lemao and Jiang, Haiyun and Shi, Shuming},
  journal = {EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Proceedings},
  eprint = {2202.08479},
  month = {feb},
  number = {1},
  pages = {3178--3190},
  title = {{On the Evaluation Metrics for Paraphrase Generation}},
  url = {http://arxiv.org/abs/2202.08479},
  year = {2022}
}
```
## Further References

- [Official implementation](https://github.com/shadowkiller33/parascore_toolkit)