To **reproduce the environment-specific benchmark results** reported below,
|
users should evaluate the **environment-specific checkpoints**
`TraceGen_{EnvName}` from the [TraceGen Collection](https://huggingface.co/collections/furonghuang-lab/tracegen), which are trained using data from the corresponding environment only.

**Metric definition.**
All reported errors are computed in a **normalized coordinate space**:
both input images and predicted traces are scaled to the range **[0, 1]** prior to evaluation.
Accordingly, the reported MSE, MAE, and Endpoint MSE reflect **absolute errors within the normalized image space**.

| Environment | Metric       | TraceGen (×1e−2) |
| ----------- | ------------ | ---------------- |
| EpicKitchen | MSE          | 0.445            |
|             | MAE          | 2.721            |
|             | Endpoint MSE | 0.791            |
| Droid       | MSE          | 0.206            |
|             | MAE          | 1.289            |
|             | Endpoint MSE | 0.285            |
| Bridge      | MSE          | 0.653            |
|             | MAE          | 2.419            |
|             | Endpoint MSE | 0.607            |
| Libero      | MSE          | 0.276            |
|             | MAE          | 1.442            |
|             | Endpoint MSE | 0.385            |
| Robomimic   | MSE          | TBD              |
|             | MAE          | TBD              |
|             | Endpoint MSE | TBD              |
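As a rough illustration of these definitions, the sketch below computes MSE, MAE, and Endpoint MSE for a single trace, assuming predicted and ground-truth traces are `(T, 2)` arrays of `(x, y)` points already normalized to `[0, 1]`. The function and variable names are illustrative only (not the repository's actual evaluation code), and it assumes Endpoint MSE averages the squared error over the final point's coordinates:

```python
import numpy as np

def trace_metrics(pred, gt):
    """Compute MSE, MAE, and Endpoint MSE between two traces.

    Both `pred` and `gt` are (T, 2) arrays of (x, y) points normalized
    to [0, 1], so all errors are absolute in that normalized space.
    (Illustrative sketch, not the repo's evaluation script.)
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    err = pred - gt
    mse = np.mean(err ** 2)               # mean squared error over all points
    mae = np.mean(np.abs(err))            # mean absolute error over all points
    endpoint_mse = np.mean(err[-1] ** 2)  # squared error of the final point only
    return mse, mae, endpoint_mse

# Example: a predicted trace offset from the ground truth by 0.01 everywhere.
gt = np.linspace([0.0, 0.0], [1.0, 1.0], num=10)
pred = gt + 0.01
mse, mae, ep = trace_metrics(pred, gt)
print(mse, mae, ep)  # 1e-4, 1e-2, 1e-4
```

Because coordinates live in `[0, 1]`, an MAE of 1e-2 corresponds to an average deviation of 1% of the image width/height, which is why the table reports values scaled by ×1e−2.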
### Submitting to the Leaderboard