fix README.md
The architecture of iFlyBotVLM is designed to realize four critical functional capabilities:
- **🧭Spatial Understanding and Metric Estimation**: Provides the model with the capacity to understand spatial relationships and perform relative position estimation among objects in the environment.
- **🎯Interactive Target Grounding**: Supports diverse grounding mechanisms, including 2D/3D object detection in the visual modality, language-based object and spatial referring, and the prediction of critical object affordance regions.
- **🤖Action Abstraction and Control Parameter Generation**: Generates outputs directly relevant to the manipulation domain, providing grasp poses and manipulation trajectories.
- **📋Task Planning**: Leveraging current scene understanding, this module performs multi-step prediction to decompose complex tasks into a sequence of atomic skills, fundamentally supporting the robust execution of long-horizon tasks.
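As a concrete illustration of how the grounding and spatial-understanding outputs above might be consumed downstream, here is a minimal sketch. The README does not specify the model's actual output format, so the `<box>[x1, y1, x2, y2]</box>` tag syntax and 0–1000 normalized coordinates are assumptions (a convention used by several open VLMs), not iFlyBotVLM's documented interface:

```python
# Assumption: boxes are emitted inline as <box>[x1, y1, x2, y2]</box>
# with coordinates normalized to a 0-1000 grid. This format is a guess
# for illustration, not iFlyBotVLM's documented output schema.
import re

def parse_boxes(text, img_w, img_h):
    """Extract normalized [x1, y1, x2, y2] boxes and scale them to pixel coordinates."""
    boxes = []
    for m in re.finditer(r"<box>\[(\d+),\s*(\d+),\s*(\d+),\s*(\d+)\]</box>", text):
        x1, y1, x2, y2 = (int(v) for v in m.groups())
        boxes.append((x1 * img_w / 1000, y1 * img_h / 1000,
                      x2 * img_w / 1000, y2 * img_h / 1000))
    return boxes

def relative_position(box_a, box_b):
    """Coarse left/right relation between two boxes, compared by center x."""
    center_a = (box_a[0] + box_a[2]) / 2
    center_b = (box_b[0] + box_b[2]) / 2
    return "left of" if center_a < center_b else "right of"

reply = ("the cup <box>[100, 200, 300, 400]</box> is near "
         "the plate <box>[600, 250, 900, 500]</box>")
cup, plate = parse_boxes(reply, img_w=1000, img_h=1000)
print(relative_position(cup, plate))  # left of
```

A real pipeline would feed such parsed boxes and relations into the grasp-pose and trajectory stages described in the next bullet.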
We anticipate that iFlyBotVLM will serve as an efficient and scalable foundation model, driving the advancement of embodied AI from single-task capabilities toward generalist intelligent agents.
iFlyBotVLM demonstrates superior performance across various challenging benchmarks.
iFlyBotVLM-8B achieves state-of-the-art (SOTA) or near-SOTA performance on ten spatial understanding, spatial perception, and temporal task planning benchmarks: Where2Place, RefSpatial-Bench, ShareRobot-affordance, ShareRobot-trajectory, BLINK (spatial), EmbSpatial, ERQA, CVBench, SAT, and EgoPlan2.

## 🚀Quick Start