---
title: SPICE
tags:
- evaluate
- metric
description: "SPICE (Semantic Propositional Image Caption Evaluation) is a metric for evaluating the quality of image captions by measuring semantic similarity."
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
---

# Metric Card for SPICE

*This module calculates the SPICE metric for evaluating image captioning models.*

**Apple Silicon is not supported. Make sure JDK 8 or 11 is installed before using this module.**

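The JDK requirement can be checked up front. The following is a minimal sketch, not part of this module: `parse_java_major` and `find_java_major` are hypothetical helper names, and the sketch assumes the relevant `java` executable is the one found on `PATH`.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_381").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def find_java_major():
    """Return the major version of the `java` on PATH, or None if unavailable."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr or result.stdout)
```

Calling `find_java_major()` before loading the metric and checking that the result is `8` or `11` gives an early, readable failure instead of an error from the Java subprocess.
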
## Metric Description

*SPICE (Semantic Propositional Image Caption Evaluation) is a metric for evaluating the quality of image captions. It parses the generated caption and the reference captions into scene graphs and scores the generated caption with an F-score over the semantic propositions (objects, attributes, and relations) that the two share.*

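To make the F-score over propositions concrete, here is a simplified sketch. It is not the SPICE implementation: real SPICE derives the tuples with a dependency parser and also matches synonyms, whereas this sketch assumes the tuples are already given and matches them exactly. The function name `spice_like_f1` and the hand-written tuples are illustrative assumptions.

```python
def spice_like_f1(candidate_tuples, reference_tuples):
    """F1 over exactly matching proposition tuples (simplified, no synonym matching)."""
    candidate = set(candidate_tuples)
    reference = set(reference_tuples)
    matched = candidate & reference
    if not matched:
        return 0.0
    precision = len(matched) / len(candidate)
    recall = len(matched) / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hand-written tuples for illustration only: objects are 1-tuples,
# attributes are (object, attribute), relations are (subject, relation, object).
candidate = {
    ("train",),
    ("track",),
    ("road",),
    ("train", "travel down", "track"),
}
reference = {
    ("train",),
    ("track",),
    ("station",),
    ("train", "blue"),
    ("train", "travel down", "track"),
}

score = spice_like_f1(candidate, reference)
print(round(score, 4))
```

Here three of the four candidate tuples appear among the five reference tuples, so precision is 3/4, recall is 3/5, and the F-score is 2/3.
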
## How to Use

*To use the SPICE metric, provide the generated captions together with a set of reference captions for each of them. The metric then computes the SPICE score from the semantic similarity between the generated and reference captions.*

*A complete example is given in the Examples section.*

### Inputs

- **predictions** *(list of list of strings): The generated captions to evaluate.*
- **references** *(list of list of strings): The reference captions for each generated caption.*

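Because both arguments are nested lists, a quick shape check before calling the metric can catch mistakes such as passing a flat list of strings. The sketch below is a hypothetical helper, not part of this module; it assumes that `predictions` and `references` must have the same number of entries, which the card does not state explicitly.

```python
def check_spice_inputs(predictions, references):
    """Raise if the inputs are not lists of lists of strings of equal length."""
    for name, groups in (("predictions", predictions), ("references", references)):
        if not isinstance(groups, list):
            raise TypeError(f"{name} must be a list of lists of strings")
        for index, captions in enumerate(groups):
            if not isinstance(captions, list) or not all(
                isinstance(caption, str) for caption in captions
            ):
                raise TypeError(f"{name}[{index}] must be a list of strings")
    # Assumption: one entry of references per entry of predictions.
    if len(predictions) != len(references):
        raise ValueError(
            f"got {len(predictions)} prediction entries "
            f"but {len(references)} reference entries"
        )


check_spice_inputs(
    predictions=[["train traveling down a track in front of a road"]],
    references=[["a train traveling down tracks next to lights"]],
)
```
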
### Output Values

- **metric_score** *(list of dict): The SPICE score representing the semantic similarity between the generated and reference captions.*

### Examples

```python
import evaluate

metric = evaluate.load("sunhill/spice")
results = metric.compute(
    predictions=[["train traveling down a track in front of a road"]],
    references=[
        [
            "a train traveling down tracks next to lights",
            "a blue and silver train next to train station and trees",
            "a blue train is next to a sidewalk on the rails",
            "a passenger train pulls into a train station",
            "a train coming down the tracks arriving at a station",
        ]
    ]
)
print(results)
```

## Citation

```bibtex
@inproceedings{spice2016,
  title     = {SPICE: Semantic Propositional Image Caption Evaluation},
  author    = {Peter Anderson and Basura Fernando and Mark Johnson and Stephen Gould},
  year      = {2016},
  booktitle = {ECCV}
}
```

## Further References

- [SPICE](https://github.com/peteanderson80/SPICE)
- [Image Caption Metrics](https://github.com/EricWWWW/image-caption-metrics)