Maniskill_gen_new — Data Collection for Compositional Generalization

This repository contains the data collection codebase used to generate training data for studying compositional language grounding in robotic manipulation. It is built on top of ManiSkill and provides a suite of pairwise conflict experiments designed to measure which linguistic factors a policy prioritizes when given conflicting instructions.

Design Overview

Core Concept: Pairwise Conflict Experiments

Each experiment places two linguistic factors in conflict. For example, in verb_size the instruction might be "Push the smaller cube", where the verb (push) and the size (smaller) cue different target objects, so the robot must pick one factor to follow. By varying which factor is "seen" during training (via the data collection strategy), we can bias the policy toward different grounding behaviors.

The custom ManiSkill environment VerbObjectColor-v1 (in mani_skill/envs/tasks/tabletop/verb_object_color_env.py) implements all 10 pairwise experiments with a 5-factor vocabulary:

Factor	Values
Verb	`lift`, `grasp`, `push`, `pull`, `rotate`, `slide`
Color	`red`, `yellow`, `blue`, `orange`, `green`, `black`
Shape	`cube`, `sphere`, `cup`, `car`, `pyramid`, `star`
Spatial	`left`, `right`, `middle`, `front`, `behind`
Size	`small`, `large`, `smaller`, `larger`, `smallest`, `largest`

10 Pairwise Experiment Types

Experiment	Factor 1	Factor 2	Fixed
`verb_color`	verb	color	shape=cube
`verb_object`	verb	shape	color=red
`color_object`	color	shape	verb=lift
`verb_size`	verb	size	color=red, shape=cube
`verb_spatial`	verb	spatial	color=red, shape=cube
`color_size`	color	size	verb=lift, shape=cube
`color_spatial`	color	spatial	verb=lift, shape=cube
`spatial_size`	spatial	size	verb=lift, color=red, shape=cube
`size_object`	size	shape	verb=lift, color=red
`spatial_object`	spatial	shape	verb=lift, color=red
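The ten experiments in the table are exactly the unordered pairs that can be drawn from the five factors. The short check below is an illustrative sketch: the names `FACTORS`, `EXPERIMENTS`, and `all_factor_pairs` are hypothetical and not taken from the repository, and `object` in an experiment name is read as the shape factor, as the table indicates.

```python
from itertools import combinations

# The five factors of the vocabulary table.
FACTORS = ["verb", "color", "shape", "spatial", "size"]

# Experiment name -> (factor 1, factor 2), copied from the table above.
# "object" in an experiment name refers to the shape factor.
EXPERIMENTS = {
    "verb_color": ("verb", "color"),
    "verb_object": ("verb", "shape"),
    "color_object": ("color", "shape"),
    "verb_size": ("verb", "size"),
    "verb_spatial": ("verb", "spatial"),
    "color_size": ("color", "size"),
    "color_spatial": ("color", "spatial"),
    "spatial_size": ("spatial", "size"),
    "size_object": ("size", "shape"),
    "spatial_object": ("spatial", "shape"),
}


def all_factor_pairs(factors):
    """Return every unordered pair of distinct factors."""
    return {frozenset(pair) for pair in combinations(factors, 2)}


# Unordered pairs actually covered by the listed experiments.
listed_pairs = {frozenset(pair) for pair in EXPERIMENTS.values()}
```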

The factor grid is a 6×6 matrix. Training coverage is controlled by --factor-count in {6, 12, 18}, which determines how many (factor1_idx, factor2_idx) cells are included in training demos. The exact cell count depends on the strategy; see Factor Grid Convention.
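One way to picture a training set is as a set of (row, col) cells marked on the 6×6 grid. The helpers below are a minimal sketch for inspecting such a set; `render_grid` and `coverage` are hypothetical names and do not correspond to functions in the repository.

```python
GRID_SIZE = 6  # the pairwise factor grid is described as 6x6


def render_grid(cells, size=GRID_SIZE):
    """Draw selected cells as '#' and unselected cells as '.', one row per line."""
    selected = set(cells)
    for r, c in selected:
        if not (0 <= r < size and 0 <= c < size):
            raise ValueError(f"cell {(r, c)} is outside the {size}x{size} grid")
    return "\n".join(
        "".join("#" if (r, c) in selected else "." for c in range(size))
        for r in range(size)
    )


def coverage(cells, size=GRID_SIZE):
    """Fraction of grid cells that appear in the training set."""
    return len(set(cells)) / (size * size)
```

For example, `print(render_grid([(0, 0), (1, 1)]))` marks the first two diagonal cells.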

All-Factor Experiment

collection_strategy/all_factor/ implements the full 5-factor experiment: 6 verbs × 6 colors × 6 shapes × 5 spatials × 4 sizes = 4320 cells (sampled via f50 strategies covering 50 cells).
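The cell count can be checked by enumerating index tuples. In the sketch below, uniform sampling only stands in for the f50 strategies, whose actual selection rules are defined in the repository and not reproduced here; `FACTOR_SIZES`, `all_cells`, and `sample_cells` are hypothetical names.

```python
import random
from itertools import product

# Number of values per factor in the all-factor experiment, as stated above.
FACTOR_SIZES = {"verb": 6, "color": 6, "shape": 6, "spatial": 5, "size": 4}


def all_cells():
    """Enumerate every (verb, color, shape, spatial, size) index tuple."""
    return list(product(*(range(n) for n in FACTOR_SIZES.values())))


def sample_cells(k=50, seed=0):
    """Pick k distinct cells uniformly at random (an assumption, not the f50 rule)."""
    rng = random.Random(seed)
    return rng.sample(all_cells(), k)
```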

Data Collection Strategies

Four families of strategies control which cells in the factor grid are included in training data:

Stair (Ours)

A staircase pattern that systematically traverses the diagonal of the factor grid. Each new cell maximally increases one factor's breadth while maintaining full coverage of the other. This provides structured curriculum-style coverage with high compositional exposure per episode.

stair (Maniskill_gen_new convention, verb-first): diagonal staircase
stair1: alternative staircase with a shape/color-first anchor
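To make the idea of a diagonal staircase concrete, the sketch below walks the grid by alternating one step along the columns and one step along the rows. This is an assumed pattern for illustration only: the real index lists live in pairwise_factor_patterns.py and may differ, and `staircase_cells` is a hypothetical name.

```python
def staircase_cells(n_cells, size=6):
    """Return up to n_cells cells along a diagonal staircase starting at (0, 0).

    The walk alternates a step to the right (next column) with a step down
    (next row), so consecutive cells always share a row or a column.
    """
    cells = []
    r = c = 0
    step_right = True
    while len(cells) < n_cells and r < size and c < size:
        cells.append((r, c))
        if step_right:
            c += 1
        else:
            r += 1
        step_right = not step_right
    return cells
```

Because each new cell shares one index with the previous one, every added cell introduces a new value of exactly one factor while reusing a value of the other.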

L-Random (`Lrandom`)

A hybrid strategy combining a fixed L-shaped spine (high-information anchor cells) with random fills. Provides structured core coverage with randomized diversity.

Random

Uniform random sampling of cells from the 6×6 grid. Baseline strategy.
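A uniform baseline of this kind can be sketched as sampling distinct cells without replacement. The function name `random_cells` and the seeding scheme are assumptions; the repository's own sampler may restrict the candidate cells differently.

```python
import random
from itertools import product


def random_cells(n_cells, size=6, seed=None):
    """Sample n_cells distinct (row, col) cells uniformly from a size x size grid."""
    rng = random.Random(seed)
    grid = list(product(range(size), repeat=2))
    return rng.sample(grid, n_cells)
```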

StairRandom

Staircase cells combined with additional random cells. Experimental hybrid.

Repository Structure

collection_strategy/          <- Core data collection code
├── lib/
│   ├── pairwise_factor_patterns.py   # STAIR, LRANDOM, L1 index definitions
│   ├── pairwise_strategies.py        # Strategy type enum
│   ├── pairwise_task_language.py     # Instruction string generation
│   ├── all_factor_support_f50.py     # All-factor f50 strategy definitions
│   └── all_factor_scene.py           # All-factor scene setup
├── collect_pairwise_attribute.py     # Main collector: verb_size, verb_spatial,
│                                     #   color_size, spatial_size
├── collect_pairwise.py               # Legacy: verb_color, verb_object, color_object
├── collect_verb_color_object.py      # VerbObjectColor triple-factor collector
├── collect_all_factor.py             # All-5-factor collector
├── convert_maniskill_to_lerobot.py   # Convert H5 demos -> LeRobot dataset
├── build_attribute_task_map.py       # Build task language map from filenames
├── {verb_color, verb_object, color_object, ...}/
│   ├── collect.py                    # Thin wrapper calling main collector
│   └── collect_convert.sh            # Full pipeline: collect -> replay -> LeRobot
└── all_factor/
    ├── collect.py
    └── collect_convert.sh

conflict_experiment/          <- Conflict-eval specific collection
├── collect_conflict.py               # Conflict eval data collector
├── collect_conflict_attribute.py     # Attribute-version collector
└── lib/conflict_sampling.py          # Conflict pair sampling logic

scripts/
├── run_verb_color_shape_motion_planning.py   # Core motion planning runner
└── ...                               # Utility scripts

asset/                        <- Custom 3D object meshes
├── cup.obj                   # Cup mesh
├── car.obj                   # Car mesh
└── can.obj                   # Can mesh

mani_skill/                   <- Modified ManiSkill framework
└── envs/tasks/tabletop/
    └── verb_object_color_env.py      # Custom VerbObjectColor-v1 environment

Pipeline

Each experiment follows a 3-stage pipeline:

1. COLLECT   ->  Motion-planning demos saved as .h5 (no RGB obs)
2. REPLAY    ->  Replay trajectories with RGB rendering -> *_rgb.h5
3. CONVERT   ->  Convert to LeRobot parquet format -> HuggingFace dataset

The collect_convert.sh in each experiment folder automates all three stages.
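The replay stage writes a file with an `_rgb` suffix next to the collected trajectory. A small helper for deriving that name might look like the following; `rgb_replay_path` is a hypothetical name, and it assumes the replayed file is stored in the same directory as the input.

```python
from pathlib import PurePosixPath


def rgb_replay_path(demo_path: str) -> str:
    """Map 'dir/name.h5' to 'dir/name_rgb.h5' (same directory assumed)."""
    p = PurePosixPath(demo_path)
    return str(p.with_name(f"{p.stem}_rgb{p.suffix}"))
```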

Quick Start

cd /path/to/Maniskill_gen_new

# Collect 200 demos for verb_size with Staircase-f18 strategy
STRATEGY=stair FACTOR_COUNT=18 NUM_DEMOS=200 \
  bash collection_strategy/verb_size/collect_convert.sh /path/to/output

# Collect only (no replay/convert):
python -m collection_strategy.collect_pairwise_attribute \
  --experiment verb_size \
  --strategy stair \
  --factor-count 18 \
  --num-demos 200 \
  --record-base /path/to/output
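To sweep over several factor counts, the collect-only command can be assembled programmatically. The sketch below only builds and prints the command lines using the flags shown above; `build_collect_cmd` is a hypothetical helper, and the output path is a placeholder.

```python
import shlex


def build_collect_cmd(experiment, strategy, factor_count, num_demos, record_base):
    """Build the argv list for the collect-only command shown above."""
    return [
        "python", "-m", "collection_strategy.collect_pairwise_attribute",
        "--experiment", experiment,
        "--strategy", strategy,
        "--factor-count", str(factor_count),
        "--num-demos", str(num_demos),
        "--record-base", record_base,
    ]


if __name__ == "__main__":
    # Print one command per documented factor count; run them with subprocess if desired.
    for factor_count in (6, 12, 18):
        cmd = build_collect_cmd("verb_size", "stair", factor_count, 200, "/path/to/output")
        print(shlex.join(cmd))
```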

All-Factor Collection

STRATEGY=Lrandom_pure NUM_DEMOS=800 \
  bash collection_strategy/all_factor/collect_convert.sh /path/to/output

Factor Grid Convention (Maniskill_gen_new)

Index convention for pairwise experiments:

Experiment	Row (factor1_idx)	Col (factor2_idx)
`verb_color`	color_idx	verb_idx
`verb_object`	shape_idx	verb_idx
`color_object`	color_idx	shape_idx
`verb_size`	size_idx	verb_idx
`verb_spatial`	spatial_idx	verb_idx
`color_size`	color_idx	size_idx
`spatial_size`	spatial_idx	size_idx
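The convention above can be applied by looking up which factor each grid axis indexes and then reading the value from the vocabulary. This sketch assumes indices follow the order in which values are listed in the vocabulary table, which the text does not confirm; `VOCAB`, `GRID_AXES`, and `decode_cell` are hypothetical names.

```python
# Vocabulary in the order listed in the factor table (index order is an assumption).
VOCAB = {
    "verb": ["lift", "grasp", "push", "pull", "rotate", "slide"],
    "color": ["red", "yellow", "blue", "orange", "green", "black"],
    "shape": ["cube", "sphere", "cup", "car", "pyramid", "star"],
    "spatial": ["left", "right", "middle", "front", "behind"],
    "size": ["small", "large", "smaller", "larger", "smallest", "largest"],
}

# Experiment -> (row factor, column factor), copied from the table above.
GRID_AXES = {
    "verb_color": ("color", "verb"),
    "verb_object": ("shape", "verb"),
    "color_object": ("color", "shape"),
    "verb_size": ("size", "verb"),
    "verb_spatial": ("spatial", "verb"),
    "color_size": ("color", "size"),
    "spatial_size": ("spatial", "size"),
}


def decode_cell(experiment, row, col):
    """Translate a (row, col) grid cell into the factor values it selects."""
    row_factor, col_factor = GRID_AXES[experiment]
    return {
        row_factor: VOCAB[row_factor][row],
        col_factor: VOCAB[col_factor][col],
    }
```

Under these assumptions, `decode_cell("verb_size", 2, 3)` selects the size `smaller` and the verb `pull`.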

Factor counts (Maniskill_gen_new convention):

stair f6 = 5 cells, f12 = 10 cells, f18 = 15 cells
Lrandom f6 = 4 cells, f12 = 9 cells, f18 = 14 cells
random f6 = 3 cells (2x2 subgrid), f12 = 6 cells, f18 = 9 cells
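The counts listed above can be kept in a small lookup table, for example to sanity-check a generated cell list. `CELL_COUNTS` and `cells_for` are hypothetical names; the numbers are copied from the list above.

```python
# Strategy -> {factor_count: number of grid cells}, as listed above.
CELL_COUNTS = {
    "stair": {6: 5, 12: 10, 18: 15},
    "Lrandom": {6: 4, 12: 9, 18: 14},
    "random": {6: 3, 12: 6, 18: 9},
}


def cells_for(strategy, factor_count):
    """Return the number of training cells for a strategy and --factor-count value."""
    try:
        return CELL_COUNTS[strategy][factor_count]
    except KeyError:
        raise ValueError(
            f"no cell count listed for strategy={strategy!r}, factor_count={factor_count}"
        ) from None
```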

Setup

# Install ManiSkill (modified version)
conda create -n maniskill39 python=3.9
conda activate maniskill39
pip install -e .

# For LeRobot conversion
conda activate openpi  # or any env with lerobot
pip install lerobot

Related

Evaluation code: yqi19/evaluation_pi0_pi05
Training datasets: yqi19/data_05_17
ManiSkill: haosulab/ManiSkill
