RoboCasa_Env / STATE_ACTION_SPEC.md
Whalswp's picture
Upload folder using huggingface_hub
d893a8a verified

RoboCasa State / Action ๋ช…์„ธ

๊ทผ๊ฑฐ ํŒŒ์ผ: env.py, gym_wrapper.py (PandaOmronKeyConverter), robosuite/controllers/parts/arm/osc.py, robosuite/controllers/config/robots/default_pandaomron.json


State (์ด 16์ฐจ์›)

env.py: convert_state() ๊ธฐ์ค€์œผ๋กœ concatenate๋จ.

์ธ๋ฑ์Šค ์ฐจ์› ํ‚ค absolute / relative ํ‘œํ˜„
0~2 3 state.base_position absolute xyz
3~6 4 state.base_rotation absolute Quaternion (robot0_base_quat)
7~9 3 state.end_effector_position_relative relative (base โ†’ EE) xyz
10~13 4 state.end_effector_rotation_relative relative (base โ†’ EE) Quaternion (robot0_base_to_eef_quat)
14~15 2 state.gripper_qpos โ€” joint position

Action (์ด 12์ฐจ์›)

env.py: convert_action() ๊ธฐ์ค€์œผ๋กœ ๋ถ„ํ•ด๋จ.

์ธ๋ฑ์Šค ์ฐจ์› ํ‚ค ์„ค๋ช…
0~3 4 action.base_motion ๋ฒ ์ด์Šค ์ด๋™ (์•„๋ž˜ ์ฐธ๊ณ )
4 1 action.control_mode ์ œ์–ด ๋ชจ๋“œ ์Šค์œ„์น˜ (์•„๋ž˜ ์ฐธ๊ณ )
5~7 3 action.end_effector_position EE delta position, base frame ๊ธฐ์ค€
8~10 3 action.end_effector_rotation EE delta rotation, base frame ๊ธฐ์ค€, axis-angle
11 1 action.gripper_close ๊ทธ๋ฆฌํผ ๋‹ซ๊ธฐ (0.5 threshold โ†’ binary)

base_motion (4์ฐจ์›) ์ƒ์„ธ

์ธ๋ฑ์Šค ๋Œ€์ƒ controller type ์„ค๋ช…
0~2 robot0_base JOINT_VELOCITY ๋ชจ๋ฐ”์ผ ๋ฒ ์ด์Šค x์†๋„ / y์†๋„ / yaw์†๋„
3 robot0_torso JOINT_POSITION ๋ชธํ†ต ์ˆ˜์ง ๋ฆฌํ”„ํŠธ joint position (โ‰ˆ ๋†’์ด)

control_mode (1์ฐจ์›) ์ƒ์„ธ

๊ฐ’ base_mode ๋™์ž‘
< 0.5 -1 Arm mode โ€” ๋ฒ ์ด์Šค ๊ณ ์ •, ํŒ”๋กœ ์กฐ์ž‘ (goal: achieved ๊ธฐ์ค€)
โ‰ฅ 0.5 +1 Base mode โ€” ๋ฒ ์ด์Šค ์ด๋™, ํŒ” ๋ชฉํ‘œ ์œ ์ง€ (goal: desired ๊ธฐ์ค€)

EE Rotation: axis-angle์„ ์“ฐ๋Š” ์ด์œ 

OSC controller(osc.py)๋Š” rotation input์„ Rotation.from_rotvec() ์œผ๋กœ ํ•ด์„ โ†’ axis-angle ๊ณ ์ •.

RPY(Euler angle) ๋Œ€์‹  axis-angle์„ ์“ฐ๋Š” ์ด์œ :

  1. Gimbal lock ์—†์Œ โ€” RPY๋Š” ํŠน์ • ์ž์„ธ์—์„œ ๋‘ ์ถ•์ด ๊ฒน์ณ DOF๋ฅผ ์žƒ๋Š” singularity ๋ฐœ์ƒ. EE๋Š” ์ž์œ  ํšŒ์ „ํ•˜๋ฏ€๋กœ ์‹ค์ œ ๋ฌธ์ œ๊ฐ€ ๋จ.
  2. Delta ์ œ์–ด์— ์ž์—ฐ์Šค๋Ÿฌ์›€ โ€” "์ด ์ถ• ๋ฐฉํ–ฅ์œผ๋กœ ฮธ๋งŒํผ ํšŒ์ „" ์˜๋ฏธ๊ฐ€ ์ง๊ด€์ ์ด๊ณ  ๋ณด๊ฐ„์ด smooth. RPY delta๋Š” ์ˆœ์„œ ์˜์กด์„ฑ(rollโ†’pitchโ†’yaw) ๋•Œ๋ฌธ์— ํ•ฉ์„ฑ์ด ๋ณต์žกํ•จ.
  3. ํฌ๊ธฐ = ํšŒ์ „๋Ÿ‰ โ€” ๋ฒกํ„ฐ norm์ด ํšŒ์ „๊ฐ์ด๋ผ output clipping์ด ์ž์—ฐ์Šค๋Ÿฌ์›€. (output_max: [0.5, 0.5, 0.5] rad)

RPY ์ž…๋ ฅ์€ ์ฝ”๋“œ์ƒ ์ง€์›ํ•˜์ง€ ์•Š์Œ. ํ•„์š”ํ•˜๋ฉด wrapper์—์„œ ๋ณ€ํ™˜ ํ•„์š”:

from scipy.spatial.transform import Rotation
axis_angle = Rotation.from_euler('xyz', rpy).as_rotvec()

์‹œ๋ฎฌ๋ ˆ์ด์…˜ ์—ฐ๊ฒฐ ํ๋ฆ„

policy output (12-dim)
    โ†“ convert_action()  [env.py]
action dict (base_motion, control_mode, EE_pos, EE_rot, gripper_close)
    โ†“ unmap_action()  [gym_wrapper.py]
{
  robot0_right:        concat(EE_pos[3], EE_rot[3])  โ†’ OSC_POSE controller
  robot0_right_gripper: threshold(gripper_close, 0.5) โ†’ -1 or +1
  robot0_base:         base_motion[0:3]              โ†’ JOINT_VELOCITY controller
  robot0_torso:        base_motion[3:4]              โ†’ JOINT_POSITION controller
  robot0_base_mode:    threshold(control_mode, 0.5)  โ†’ -1 or +1
}
    โ†“ env.step()  [robosuite]
MuJoCo simulation