Added description
semf1.py
CHANGED
|
@@ -11,7 +11,7 @@
|
|
| 11 |
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
| 12 |
# See the License for the specific language governing permissions and
|
| 13 |
# limitations under the License.
|
| 14 |
-
# TODO: Add test cases,
|
| 15 |
"""SEM-F1 metric"""
|
| 16 |
|
| 17 |
import abc
|
|
@@ -58,19 +58,51 @@ _KWARGS_DESCRIPTION = """
|
|
| 58 |
SEM-F1 compares the system generated overlap summary with ground truth reference overlap.
|
| 59 |
|
| 60 |
Args:
|
| 61 |
-
predictions:
|
| 62 |
-
references:
|
| 63 |
reference should be a string with tokens separated by spaces.
|
| 64 |
model_type: str - Model to use. [pv1, stsb, use]
|
| 65 |
Options:
|
| 66 |
-
pv1 - paraphrase-distilroberta-base-v1
|
| 67 |
stsb - stsb-roberta-large
|
| 68 |
use - Universal Sentence Encoder
|
| 69 |
Returns:
|
| 70 |
precision: Precision.
|
| 71 |
recall: Recall.
|
| 72 |
f1: F1 score.
|
| 73 |
|
| 74 |
Examples:
|
| 75 |
|
| 76 |
>>> import evaluate
|
|
|
|
| 11 |
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
| 12 |
# See the License for the specific language governing permissions and
|
| 13 |
# limitations under the License.
|
| 14 |
+
# TODO: Add test cases; remove the tokenize_sentences flag, since it can be determined from the input itself.
|
| 15 |
"""SEM-F1 metric"""
|
| 16 |
|
| 17 |
import abc
|
|
|
|
| 58 |
SEM-F1 compares the system generated overlap summary with ground truth reference overlap.
|
| 59 |
|
| 60 |
Args:
|
| 61 |
+
predictions: list - List of predictions (Details below)
|
| 62 |
+
references: list - List of references (Details below)
|
| 63 |
reference should be a string with tokens separated by spaces.
|
| 64 |
model_type: str - Model to use. [pv1, stsb, use]
|
| 65 |
Options:
|
| 66 |
+
pv1 - paraphrase-distilroberta-base-v1 (Default)
|
| 67 |
stsb - stsb-roberta-large
|
| 68 |
use - Universal Sentence Encoder
|
| 69 |
+
tokenize_sentences: bool - Sentence tokenize the input document (prediction/reference). Default: True.
|
| 70 |
+
gpu: Union[bool, int] - Whether to use GPU or CPU.
|
| 71 |
+
Options:
|
| 72 |
+
False - CPU (Default)
|
| 73 |
+
True - GPU, device 0
|
| 74 |
+
n: int - GPU, device n
|
| 75 |
+
batch_size: int - Batch Size, Default = 32.
|
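The `gpu` argument documented above accepts either a bool or an int. The diff does not show how semf1.py turns that value into a device, so the following is only a minimal sketch of one way to do it. The helper name `resolve_device` and the returned device strings are assumptions made for illustration, not part of the metric's code.

```python
from typing import Union


def resolve_device(gpu: Union[bool, int] = False) -> str:
    """Hypothetical helper: map the documented `gpu` argument to a device string.

    False -> CPU (default), True -> GPU device 0, int n -> GPU device n.
    """
    # bool must be tested before int, because bool is a subclass of int in Python
    # and True would otherwise be read as device index 1.
    if isinstance(gpu, bool):
        return "cuda:0" if gpu else "cpu"
    if isinstance(gpu, int):
        if gpu < 0:
            raise ValueError("GPU device index must be non-negative")
        return f"cuda:{gpu}"
    raise TypeError("gpu must be a bool or an int")
```

Under this reading, an integer `0` selects GPU device 0 while `False` selects the CPU, which is why the type check order matters.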
| 76 |
Returns:
|
| 77 |
precision: Precision.
|
| 78 |
recall: Recall.
|
| 79 |
f1: F1 score.
|
| 80 |
|
| 81 |
+
There are 4 possible input cases for the predictions and references arguments:
|
| 82 |
+
Case 1: Multi-Ref = False, tokenize_sentences = False
|
| 83 |
+
predictions: List[List[str]] - List of predictions where each prediction is a list of sentences.
|
| 84 |
+
references: List[List[str]] - List of references where each reference is a list of sentences.
|
| 85 |
+
Case 2: Multi-Ref = False, tokenize_sentences = True
|
| 86 |
+
predictions: List[str] - List of predictions where each prediction is a document
|
| 87 |
+
references: List[str] - List of references where each reference is a document
|
| 88 |
+
Case 3: Multi-Ref = True, tokenize_sentences = False
|
| 89 |
+
predictions: List[List[str]] - List of predictions where each prediction is a list of sentences.
|
| 90 |
+
references: List[List[List[str]]] - List of multi-references i.e. [[r11, r12, ...], [r21, r22, ...], ...]
|
| 91 |
+
where each rij is further a list of sentences
|
| 92 |
+
Case 4: Multi-Ref = True, tokenize_sentences = True
|
| 93 |
+
predictions: List[str] - List of predictions where each prediction is a document
|
| 94 |
+
references: List[List[str]] - List of multi-references i.e. [[r11, r12, ...], [r21, r22, ...], ...]
|
| 95 |
+
where each rij is a document
|
| 96 |
+
|
| 97 |
+
This can be summarized as a truth table:
|
| 98 |
+
Case | Multi-Ref | tokenize_sentences | predictions | references
|
| 99 |
+
1 | 0 | 0 | List[List[str]] | List[List[str]]
|
| 100 |
+
2 | 0 | 1 | List[str] | List[str]
|
| 101 |
+
3 | 1 | 0 | List[List[str]] | List[List[List[str]]]
|
| 102 |
+
4 | 1 | 1 | List[str] | List[List[str]]
|
| 103 |
+
|
| 104 |
+
Whether the input is the Multi-Ref or the Single-Ref case is determined automatically.
|
| 105 |
+
|
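The four input cases and the automatic Multi-Ref detection described in the docstring can be illustrated with a short sketch. The docstring does not say how detection is done, so the approach here (inspecting the nesting depth of the first reference) is an assumption, and `describe_inputs` is a hypothetical name that does not come from semf1.py.

```python
from typing import Tuple


def describe_inputs(
    predictions: list, references: list, tokenize_sentences: bool = True
) -> Tuple[int, bool]:
    """Hypothetical helper: return (case number, multi_ref) per the docstring's table."""
    if not predictions or not references:
        raise ValueError("predictions and references must be non-empty")
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    first_ref = references[0]
    if tokenize_sentences:
        # Documents are plain strings, so a list here means several reference documents.
        multi_ref = isinstance(first_ref, list)
    else:
        # Each document is already a list of sentences, so multi-ref adds one more level.
        multi_ref = (
            isinstance(first_ref, list)
            and len(first_ref) > 0
            and isinstance(first_ref[0], list)
        )

    # Same ordering as the truth table: (Multi-Ref, tokenize_sentences) -> case.
    case = {
        (False, False): 1,
        (False, True): 2,
        (True, False): 3,
        (True, True): 4,
    }[(multi_ref, tokenize_sentences)]
    return case, multi_ref
```

For example, `describe_inputs(["a. b."], [["c.", "d."]], tokenize_sentences=True)` falls under Case 4, since each reference entry is a list of documents. The sketch only looks at the first reference, so it assumes all entries share the same shape.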
| 106 |
Examples:
|
| 107 |
|
| 108 |
>>> import evaluate
|