# Hierarchical System ROS2
## 0- Preface
### What we have
- The paths and names of the main logic (Python) folders and files are as follows:
```
src/
└── g0_vlm_node/
    └── g0_vlm_node/
        ├── utils/       # Stores functions related to Gemini API processing
        └── vlm_main.py  # Core logic for VLM service provision
```
- Note: in the above package, `vlm_main.py` is the node's entry point.
### Development Log
- VLM
  1. Format the string so that the JSON string sent by EHI is converted into a structured string.
  2. Support a cache switch for handling repeated instructions from EHI.
  3. Support parameterized startup: `--use-qwen` and `--no-use-qwen` control model usage, with Gemini as the default.
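Dev-log item 3 describes the `--use-qwen` / `--no-use-qwen` flag pair. A minimal sketch of such a parser, assuming Python's `argparse` (the function name `parse_vlm_args` is an illustration, not the node's real API):

```python
import argparse

def parse_vlm_args(argv=None):
    """Hypothetical sketch of the VLM node's flag parsing.

    BooleanOptionalAction automatically provides both --use-qwen and
    --no-use-qwen; the default (False) selects Gemini, per the dev log.
    """
    parser = argparse.ArgumentParser(prog="vlm_main")
    parser.add_argument(
        "--use-qwen",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="Run inference with Qwen instead of the default Gemini.",
    )
    return parser.parse_args(argv)
```

`argparse.BooleanOptionalAction` (Python 3.9+) is one idiomatic way to get a paired on/off flag with a single declaration; the actual node may wire this differently.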
## 1- Install
1. Install Python dependencies
Refer to https://github.com/whitbrunn/G0
2. Compile the workspace
Clone the `src/` folder into your local workspace under `TO/YOUR/WORKSPACE/`, then run:
```
cd TO/YOUR/WORKSPACE/
colcon build --symlink-install --cmake-args -DPython3_ROOT_DIR=$CONDA_PREFIX
```
Note:
Use `ros2 pkg list | grep PACK_NAME` to check that the following ROS package exists:
- `g0_vlm_node`
## 2- Usage
1. Set your VLM API keys
```
export VLM_API_KEY=<YOUR_GEMINI_API_KEY>
export VLM_API_KEY_QWEN=<YOUR_QWEN_API_KEY>
```
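The node presumably reads these variables at startup. A minimal sketch of that lookup, assuming a hypothetical helper name (`load_vlm_api_key`) and the variable-to-model mapping shown above:

```python
import os

def load_vlm_api_key(use_qwen: bool = False) -> str:
    """Fetch the API key exported in step 1 (hypothetical helper).

    Gemini reads VLM_API_KEY; Qwen reads VLM_API_KEY_QWEN. Failing fast
    here gives a clearer error than a failed API call later.
    """
    var = "VLM_API_KEY_QWEN" if use_qwen else "VLM_API_KEY"
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before starting the node")
    return key
```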
2. Start the VLM Node
2.1 First configure the proxy for your environment (necessary for Gemini; if using the Qwen version, skip to 2.3)
```
export https_proxy=http://127.0.0.1:<PORT>
export http_proxy=http://127.0.0.1:<PORT>
export all_proxy=http://127.0.0.1:<PORT>
```
2.2 Verify that the external network is accessible
```
curl -I www.google.com
```
Expected output (partial):
```
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Cache-Control: private
Connection: keep-alive
```
2.3 After confirming the steps above, start the VLM node
```
ros2 run g0_vlm_node vlm_main
```
*If using the Qwen model for inference:
```
unset http_proxy
unset https_proxy
unset all_proxy
ros2 run g0_vlm_node vlm_main -- --use-qwen
```
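The proxy requirements above differ per model: Gemini needs the proxy variables exported, while Qwen needs them unset. A hypothetical pre-flight check along those lines (the helper name and policy encoding are assumptions):

```python
import os

def check_proxy_env(use_qwen: bool) -> bool:
    """Hypothetical sanity check mirroring steps 2.1-2.3.

    Gemini (use_qwen=False): all three proxy variables must be set.
    Qwen (use_qwen=True): all three must be unset.
    """
    proxies = [os.environ.get(v) for v in ("http_proxy", "https_proxy", "all_proxy")]
    if use_qwen:
        return all(p is None for p in proxies)
    return all(p for p in proxies)
```

Running such a check before node startup can turn a silent hang on an unreachable endpoint into an immediate, actionable error.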
## 3- What you expect
- When the VLM receives a Send request, it logs output such as:
```
2025-11-05 07:40:33.230 | INFO | g0_vlm_node.vlm_main:vlm_processor1:153 - One hp successfully processed: 将咖啡罐用右手放到托盘上 -> [Low]: Pick up the coffee can with the right hand and place it on the tray.!
```
- When the VLM receives a confirm request, it logs output such as:
```
2025-11-05 07:40:47.641 | INFO | g0_vlm_node.vlm_main:vlm_processor2:169 - One hp_ successfully sent to VLA: [Low]: Pick up the coffee can with the right hand and place it on the tray.!
```
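Dev-log item 1 mentions converting the JSON string sent by EHI into a structured string, and the logs above show the resulting `[Low]: …` form. A hypothetical sketch of that conversion (the field names `level` and `instruction` are assumptions, not the real EHI schema):

```python
import json

def format_ehi_instruction(raw: str) -> str:
    """Hypothetical sketch of dev-log item 1: turn the JSON string sent
    by EHI into the '[Level]: text' structured string seen in the logs.

    Field names are assumed; the real EHI message schema may differ.
    """
    msg = json.loads(raw)
    return f"[{msg['level']}]: {msg['instruction']}"
```

For example, `format_ehi_instruction('{"level": "Low", "instruction": "Pick up the coffee can"}')` would produce `[Low]: Pick up the coffee can`.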