RoboCasa_Env / STATE_ACTION_SPEC.md
Whalswp's picture
Upload folder using huggingface_hub
d893a8a verified
# RoboCasa State / Action ๋ช…์„ธ
> ๊ทผ๊ฑฐ ํŒŒ์ผ: `env.py`, `gym_wrapper.py` (`PandaOmronKeyConverter`), `robosuite/controllers/parts/arm/osc.py`, `robosuite/controllers/config/robots/default_pandaomron.json`
---
## State (์ด 16์ฐจ์›)
`env.py: convert_state()` ๊ธฐ์ค€์œผ๋กœ concatenate๋จ.
| ์ธ๋ฑ์Šค | ์ฐจ์› | ํ‚ค | absolute / relative | ํ‘œํ˜„ |
|--------|------|----|---------------------|------|
| 0~2 | 3 | `state.base_position` | **absolute** | xyz |
| 3~6 | 4 | `state.base_rotation` | **absolute** | **Quaternion** (`robot0_base_quat`) |
| 7~9 | 3 | `state.end_effector_position_relative` | **relative** (base โ†’ EE) | xyz |
| 10~13 | 4 | `state.end_effector_rotation_relative` | **relative** (base โ†’ EE) | **Quaternion** (`robot0_base_to_eef_quat`) |
| 14~15 | 2 | `state.gripper_qpos` | โ€” | joint position |
---
## Action (์ด 12์ฐจ์›)
`env.py: convert_action()` ๊ธฐ์ค€์œผ๋กœ ๋ถ„ํ•ด๋จ.
| ์ธ๋ฑ์Šค | ์ฐจ์› | ํ‚ค | ์„ค๋ช… |
|--------|------|----|------|
| 0~3 | 4 | `action.base_motion` | ๋ฒ ์ด์Šค ์ด๋™ (์•„๋ž˜ ์ฐธ๊ณ ) |
| 4 | 1 | `action.control_mode` | ์ œ์–ด ๋ชจ๋“œ ์Šค์œ„์น˜ (์•„๋ž˜ ์ฐธ๊ณ ) |
| 5~7 | 3 | `action.end_effector_position` | EE delta position, **base frame ๊ธฐ์ค€** |
| 8~10 | 3 | `action.end_effector_rotation` | EE delta rotation, **base frame ๊ธฐ์ค€, axis-angle** |
| 11 | 1 | `action.gripper_close` | ๊ทธ๋ฆฌํผ ๋‹ซ๊ธฐ (0.5 threshold โ†’ binary) |
### base_motion (4์ฐจ์›) ์ƒ์„ธ
| ์ธ๋ฑ์Šค | ๋Œ€์ƒ | controller type | ์„ค๋ช… |
|--------|------|-----------------|------|
| 0~2 | `robot0_base` | `JOINT_VELOCITY` | ๋ชจ๋ฐ”์ผ ๋ฒ ์ด์Šค x์†๋„ / y์†๋„ / yaw์†๋„ |
| 3 | `robot0_torso` | `JOINT_POSITION` | ๋ชธํ†ต ์ˆ˜์ง ๋ฆฌํ”„ํŠธ joint position (โ‰ˆ ๋†’์ด) |
### control_mode (1์ฐจ์›) ์ƒ์„ธ
| ๊ฐ’ | base_mode | ๋™์ž‘ |
|----|-----------|------|
| < 0.5 | -1 | **Arm mode** โ€” ๋ฒ ์ด์Šค ๊ณ ์ •, ํŒ”๋กœ ์กฐ์ž‘ (goal: `achieved` ๊ธฐ์ค€) |
| โ‰ฅ 0.5 | +1 | **Base mode** โ€” ๋ฒ ์ด์Šค ์ด๋™, ํŒ” ๋ชฉํ‘œ ์œ ์ง€ (goal: `desired` ๊ธฐ์ค€) |
---
## EE Rotation: axis-angle์„ ์“ฐ๋Š” ์ด์œ 
OSC controller(`osc.py`)๋Š” rotation input์„ `Rotation.from_rotvec()` ์œผ๋กœ ํ•ด์„ โ†’ **axis-angle ๊ณ ์ •**.
RPY(Euler angle) ๋Œ€์‹  axis-angle์„ ์“ฐ๋Š” ์ด์œ :
1. **Gimbal lock ์—†์Œ** โ€” RPY๋Š” ํŠน์ • ์ž์„ธ์—์„œ ๋‘ ์ถ•์ด ๊ฒน์ณ DOF๋ฅผ ์žƒ๋Š” singularity ๋ฐœ์ƒ. EE๋Š” ์ž์œ  ํšŒ์ „ํ•˜๋ฏ€๋กœ ์‹ค์ œ ๋ฌธ์ œ๊ฐ€ ๋จ.
2. **Delta ์ œ์–ด์— ์ž์—ฐ์Šค๋Ÿฌ์›€** โ€” "์ด ์ถ• ๋ฐฉํ–ฅ์œผ๋กœ ฮธ๋งŒํผ ํšŒ์ „" ์˜๋ฏธ๊ฐ€ ์ง๊ด€์ ์ด๊ณ  ๋ณด๊ฐ„์ด smooth. RPY delta๋Š” ์ˆœ์„œ ์˜์กด์„ฑ(rollโ†’pitchโ†’yaw) ๋•Œ๋ฌธ์— ํ•ฉ์„ฑ์ด ๋ณต์žกํ•จ.
3. **ํฌ๊ธฐ = ํšŒ์ „๋Ÿ‰** โ€” ๋ฒกํ„ฐ norm์ด ํšŒ์ „๊ฐ์ด๋ผ output clipping์ด ์ž์—ฐ์Šค๋Ÿฌ์›€. (`output_max: [0.5, 0.5, 0.5]` rad)
> RPY ์ž…๋ ฅ์€ ์ฝ”๋“œ์ƒ ์ง€์›ํ•˜์ง€ ์•Š์Œ. ํ•„์š”ํ•˜๋ฉด wrapper์—์„œ ๋ณ€ํ™˜ ํ•„์š”:
> ```python
> from scipy.spatial.transform import Rotation
> axis_angle = Rotation.from_euler('xyz', rpy).as_rotvec()
> ```
---
## ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ์—ฐ๊ฒฐ ํ๋ฆ„
```
policy output (12-dim)
โ†“ convert_action() [env.py]
action dict (base_motion, control_mode, EE_pos, EE_rot, gripper_close)
โ†“ unmap_action() [gym_wrapper.py]
{
robot0_right: concat(EE_pos[3], EE_rot[3]) โ†’ OSC_POSE controller
robot0_right_gripper: threshold(gripper_close, 0.5) โ†’ -1 or +1
robot0_base: base_motion[0:3] โ†’ JOINT_VELOCITY controller
robot0_torso: base_motion[3:4] โ†’ JOINT_POSITION controller
robot0_base_mode: threshold(control_mode, 0.5) โ†’ -1 or +1
}
โ†“ env.step() [robosuite]
MuJoCo simulation
```