| title: Levenshtein distance | |
| emoji: ✍️ | |
| colorFrom: blue | |
| colorTo: green | |
| tags: | |
| - evaluate | |
| - metric | |
| description: Levenshtein (edit) distance | |
| sdk: gradio | |
| sdk_version: 5.24.0 | |
| app_file: app.py | |
| pinned: false | |
| # Metric Card for the Levenshtein (edit) distance | |
| ## Metric Description | |
| This metric computes the Levenshtein distance, also commonly called "edit distance": the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. It is a popular metric for text similarity. | |
| This module directly calls the [Levenshtein package](https://github.com/rapidfuzz/Levenshtein) for fast execution speed. | |
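To make the definition concrete, here is a minimal pure-Python sketch of the classic dynamic-programming computation with unit costs. It is for illustration only: the function name `levenshtein_distance` is hypothetical, and the module itself delegates to the much faster Levenshtein package rather than to code like this.

```python
def levenshtein_distance(source: str, target: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    # previous_row[j] is the distance between the already processed prefix
    # of `source` and the first j characters of `target`.
    previous_row = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current_row = [i]  # turning i characters into "" takes i deletions
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current_row.append(
                min(
                    previous_row[j] + 1,  # delete source_char
                    current_row[j - 1] + 1,  # insert target_char
                    previous_row[j - 1] + substitution_cost,  # match or substitute
                )
            )
        previous_row = current_row
    return previous_row[-1]


print(levenshtein_distance("baroo", "bar"))  # 2
print(levenshtein_distance("kitten", "sitting"))  # 3
```

Keeping only two rows of the table brings memory down to the length of the second string, while the running time stays proportional to the product of both lengths.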
| ## How to Use | |
| ### Inputs | |
| - **predictions** *(list of strings): sequence of prediction strings.* | |
| - **references** *(list of strings): sequence of reference strings.* | |
| - **kwargs** *keyword arguments to pass to the [Levenshtein.distance](https://rapidfuzz.github.io/Levenshtein/levenshtein.html#Levenshtein.distance) method.* | |
| ### Output Values | |
| Dictionary with two keys: `levenshtein`, the average Levenshtein distance over all prediction/reference pairs (lower is better), and `levenshtein_ratio`, the average similarity ratio in the range [0, 1] (higher is better). | |
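The ratio can be illustrated with a short pure-Python sketch. It assumes that the ratio is the normalized indel similarity computed by `Levenshtein.ratio`, that is `1 - indel_distance / (len(a) + len(b))`, and that the module averages it over all pairs. Both points are assumptions about the implementation, although they match the example output on this card. The helper names `lcs_length` and `indel_ratio` are hypothetical.

```python
from statistics import mean


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous_row = [0] * (len(b) + 1)
    for char_a in a:
        current_row = [0]
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                current_row.append(previous_row[j - 1] + 1)
            else:
                current_row.append(max(previous_row[j], current_row[j - 1]))
        previous_row = current_row
    return previous_row[-1]


def indel_ratio(a: str, b: str) -> float:
    """Normalized indel similarity: 1 - indel_distance / (len(a) + len(b))."""
    total_length = len(a) + len(b)
    if total_length == 0:
        return 1.0  # two empty strings are identical
    # With insertions and deletions only, the distance is the number of
    # characters that fall outside the longest common subsequence.
    indel_distance = total_length - 2 * lcs_length(a, b)
    return 1 - indel_distance / total_length


pairs = [("foo", "foo"), ("baroo", "bar")]
ratios = [indel_ratio(prediction, reference) for prediction, reference in pairs]
print(ratios)  # [1.0, 0.75]
print(mean(ratios))  # 0.875
```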
| ### Examples | |
| #### Levenshtein distance | |
| ```Python | |
| import evaluate | |
| levenshtein = evaluate.load("Natooz/Levenshtein") | |
| results = levenshtein.compute( | |
|     predictions=["foo", "baroo"],  # 0 and 2 edits | |
|     references=["foo", "bar"], | |
| ) | |
| print(results) | |
| # {"levenshtein": 1, "levenshtein_ratio": 0.875} | |
| ``` | |
| #### Indel (insertion-deletion) distance | |
| The weight of each operation can be provided to customize the score. Weights are passed as a tuple in the order (insertion, deletion, substitution). For example, setting the substitution weight to 2 computes the "indel" distance, in which each substitution counts as two operations (deletion + insertion). | |
| ```Python | |
| import evaluate | |
| levenshtein = evaluate.load("Natooz/Levenshtein") | |
| results = levenshtein.compute( | |
|     predictions=["foo", "baroo"],  # 0 and 2 edits | |
|     references=["foo", "bar"], | |
|     weights=(1, 1, 2),  # weight of 2 for substitutions | |
| ) | |
| print(results) | |
| # {"levenshtein": 1, "levenshtein_ratio": 0.875} | |
| ``` | |
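The pair used above differs only by two trailing characters, so no substitution is involved and the weighted score equals the unweighted one. The following pure-Python sketch shows where the substitution weight does change the result. The function `weighted_levenshtein` is a hypothetical illustration that follows the (insertion, deletion, substitution) weight order, not the code the module runs.

```python
def weighted_levenshtein(source: str, target: str, weights=(1, 1, 1)) -> int:
    """Edit distance with per-operation costs (insertion, deletion, substitution)."""
    insert_cost, delete_cost, substitute_cost = weights
    # Building the first j characters of `target` from "" takes j insertions.
    previous_row = [j * insert_cost for j in range(len(target) + 1)]
    for i, source_char in enumerate(source, start=1):
        current_row = [i * delete_cost]
        for j, target_char in enumerate(target, start=1):
            diagonal_cost = 0 if source_char == target_char else substitute_cost
            current_row.append(
                min(
                    previous_row[j] + delete_cost,  # delete source_char
                    current_row[j - 1] + insert_cost,  # insert target_char
                    previous_row[j - 1] + diagonal_cost,  # match or substitute
                )
            )
        previous_row = current_row
    return previous_row[-1]


print(weighted_levenshtein("bar", "baz"))  # 1: one substitution
print(weighted_levenshtein("bar", "baz", weights=(1, 1, 2)))  # 2: counted as delete + insert
print(weighted_levenshtein("baroo", "bar", weights=(1, 1, 2)))  # 2: two deletions, unchanged
```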
| ## Citation | |
| ```bibtex | |
| @ARTICLE{1966SPhD...10..707L, | |
| author = {{Levenshtein}, V.~I.}, | |
| title = "{Binary Codes Capable of Correcting Deletions, Insertions and Reversals}", | |
| journal = {Soviet Physics Doklady}, | |
| year = 1966, | |
| month = feb, | |
| volume = {10}, | |
| pages = {707}, | |
| adsurl = {https://ui.adsabs.harvard.edu/abs/1966SPhD...10..707L}, | |
| adsnote = {Provided by the SAO/NASA Astrophysics Data System} | |
| } | |
| ``` |