> README.md as of commit 8665589 (parent 9fea096), "Updating Readme.md", by Rodrigo Ferreira Rodrigues.
# Metric Card for Path_Planning_evaluate

This metric evaluates path planning tasks in which a language model has to generate a valid path from a starting point to one or multiple end points in a grid while avoiding all obstacles.

## Metric Description

The metric checks whether a language model can generate a valid path from a starting point to one or multiple end points in a grid while avoiding all obstacles, and scores the generated paths for format compliance, feasibility, correctness, and optimality.

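As an informal sketch of what "valid" means here, a generated path can be checked against the grid roughly as follows. This is illustrative only: the name `is_feasible` and the 4-connected, unit-step movement rule are assumptions, not the metric's actual internals.

```python
def is_feasible(path, obstacles, ends, n):
    """Sketch: a path is feasible if it stays inside the n x n grid,
    never enters an obstacle cell, moves one cell at a time, and
    stops on one of the end points."""
    if not path:
        return False
    for (x, y) in path:
        if not (0 <= x < n and 0 <= y < n) or (x, y) in obstacles:
            return False
    # Consecutive cells must be 4-connected neighbours (unit steps).
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        if abs(x1 - x0) + abs(y1 - y0) != 1:
            return False
    return path[-1] in ends
```

For instance, `[(0,0), (0,1), (1,1)]` is feasible on a 2 x 2 grid with an obstacle at `(1,0)` and end point `(1,1)`, while `[(0,0), (1,0), (1,1)]` is not, since it crosses the obstacle.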
## How to Use

This metric takes 5 mandatory arguments: `generations` (a list of strings, the model outputs), `golds` (a list of gold paths, each a list of `(x, y)` coordinate pairs), `obstacles` (a list of the obstacle coordinates for each question), `ends` (a list of the end-point coordinates for each question) and `n` (a list of integers corresponding to the size of the grid for each question).

```python
import evaluate

pp_eval = evaluate.load("rfr2003/path_planning_evaluate")
results = pp_eval.compute(
    generations=['[(0,0), (0,1), (1,1)]', '[(0,0), (1,0), (1,1)]', '[(0,0), (1,0), (1,1), (0,1)]', '(0,0'],
    golds=[[(0,0), (0,1), (1,1)], [(0,0), (0,1), (1,1)], [(0,0), (0,1)], []],
    obstacles=[[(1,0)], [(1,0)], [], []],
    ends=[[(1,1)], [(1,1)], [(0,1)], [(0,1)]],
    n=[2, 2, 2, 2]
)
print(results)
{'compliance_ratio': 0.75, 'success_ratio': 0.6666666666666666, 'optimal_ratio': 0.3333333333333333, 'feasible_ratio': 0.6666666666666666, 'distance': 0, 'unreachable_acc': 1.0}
```

This metric doesn't take any optional arguments.

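A generation only counts toward the scores if it parses as a list of coordinate pairs. A minimal sketch of such a compliance check, using `ast.literal_eval` (an assumption for illustration, not the metric's actual parser):

```python
import ast

def parse_generation(text):
    """Sketch: parse a generated string into a list of (x, y) tuples.
    Returns None when the string is not a well-formed list of pairs."""
    try:
        value = ast.literal_eval(text.strip())
    except (ValueError, SyntaxError):
        return None
    if not isinstance(value, list):
        return None
    if not all(isinstance(p, tuple) and len(p) == 2 for p in value):
        return None
    return value
```

In the example above, `'(0,0'` fails to parse, so only 3 of the 4 generations are compliant, giving `compliance_ratio = 0.75`.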
### Output Values

This metric outputs a dictionary with the following values:

- `compliance_ratio`: the ratio of `generations` that comply with the expected list format, across all questions. Ranges from 0.0 to 1.0.
- `feasible_ratio`: the ratio of `generations` that are feasible paths, among all reachable questions. Ranges from 0.0 to 1.0.
- `success_ratio`: the ratio of `generations` that are correct, among all reachable questions. Ranges from 0.0 to 1.0.
- `optimal_ratio`: the ratio of `generations` that are optimal, among all reachable questions. Ranges from 0.0 to 1.0.
- `distance`: the mean distance to the end point over feasible paths that were not correct. A nonnegative real number; lower is better.
- `unreachable_acc`: the ratio of detected unreachable paths among all unreachable questions. Ranges from 0.0 to 1.0.

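The scores in the usage example above can be reproduced by hand. With 4 questions, of which 3 are reachable (non-empty gold path), the counts work out as follows (a back-of-the-envelope sketch, not the metric's code):

```python
num_questions = 4   # total questions in the example
num_reachable = 3   # questions with a non-empty gold path

compliant = 3       # '(0,0' is the only generation that fails to parse
feasible  = 2       # '[(0,0), (1,0), (1,1)]' crosses the obstacle at (1,0)
correct   = 2       # both feasible paths stop on an end point
optimal   = 1       # only '[(0,0), (0,1), (1,1)]' matches the gold length

compliance_ratio = compliant / num_questions   # 0.75
feasible_ratio   = feasible / num_reachable    # 0.666...
success_ratio    = correct / num_reachable     # 0.666...
optimal_ratio    = optimal / num_reachable     # 0.333...
```

Since every feasible path was also correct, `distance` is 0, and the single unreachable question was detected, so `unreachable_acc` is 1.0.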
#### Values from Popular Papers
### Examples

The same four-question example as in the usage section: three well-formed generations, one malformed, and one question with an empty gold path.

```python
import evaluate

pp_eval = evaluate.load("rfr2003/path_planning_evaluate")
results = pp_eval.compute(
    generations=['[(0,0), (0,1), (1,1)]', '[(0,0), (1,0), (1,1)]', '[(0,0), (1,0), (1,1), (0,1)]', '(0,0'],
    golds=[[(0,0), (0,1), (1,1)], [(0,0), (0,1), (1,1)], [(0,0), (0,1)], []],
    obstacles=[[(1,0)], [(1,0)], [], []],
    ends=[[(1,1)], [(1,1)], [(0,1)], [(0,1)]],
    n=[2, 2, 2, 2]
)
print(results)
{'compliance_ratio': 0.75, 'success_ratio': 0.6666666666666666, 'optimal_ratio': 0.3333333333333333, 'feasible_ratio': 0.6666666666666666, 'distance': 0, 'unreachable_acc': 1.0}
```

## Limitations and Bias
## Citation
## Further References