DC-GR00T: Demo-Conditioned GR00T Adapter (GENESIS)

⚠️ Under Active Development: This checkpoint is a research preview. The DC-GR00T manipulation pipeline is still being developed and validated, and results and APIs may change without notice. Use with caution in production.

Part of the GENESIS research framework: video-conditioned robot learning.

Paper: PhysicalAgent: Towards General Cognitive Robotics with Foundation World Models

Code: github.com/jeffrinsam/GENESIS → part2_manipulation/

Model Description

DC-GR00T is a Demo-Conditioned extension of GR00T N1.6. Instead of language instructions, it accepts a reference video of a manipulation task and extracts a task embedding that conditions the DiT action head.

This repository contains a LoRA fine-tuning adapter (PEFT) trained on Unitree G1 teleop demonstrations. Load it on top of the base nvidia/GR00T-N1.6-3B model.

Architecture additions over GR00T N1.6:

  • Demo encoder: SigLIP ViT-B/16 (224×224) per-frame → temporal transformer → perceiver resampler → task embedding [B, 16, 768]
  • Task cross-attention: Injects task embedding into DiT action head at every block
  • LoRA: r=8, α=16, applied to q/k/v/o/gate/up/down_proj layers of the language model
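The resampler step in the demo encoder can be illustrated with a minimal NumPy sketch. It is not the GENESIS implementation: it assumes a single attention head with no learned projections, random inputs, a clip of 8 frames, and 196 patch tokens per frame (14×14 patches for a 224×224 image at patch size 16). The names `resample`, `frame_tokens`, and `latents` are hypothetical. It only shows how 16 latent queries cross-attending to a variable number of frame tokens yield a fixed [B, 16, 768] output.

```python
import numpy as np

def resample(tokens, latents):
    """Cross-attend latent queries to input tokens (keys and values)."""
    d = tokens.shape[-1]
    # Attention scores between each latent and each token: [B, L, N]
    scores = np.einsum("ld,bnd->bln", latents, tokens) / np.sqrt(d)
    # Numerically stable softmax over the token axis
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Weighted sum of tokens per latent: [B, L, D]
    return np.einsum("bln,bnd->bld", weights, tokens)

rng = np.random.default_rng(0)
B, T, P, D = 2, 8, 196, 768  # batch, frames, patches per frame, hidden size (T is an assumption)
frame_tokens = rng.standard_normal((B, T * P, D)).astype(np.float32)
latents = rng.standard_normal((16, D)).astype(np.float32)  # stand-in for learned latents

task_embedding = resample(frame_tokens, latents)
print(task_embedding.shape)  # (2, 16, 768)
```

The output shape depends only on the number of latents, so clips of different lengths map to the same task-embedding size.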

Target robot: Unitree G1 (43-DOF action space: arms, torso, hands, legs)

Current Status

| Component | Status |
|---|---|
| Demo encoder | Stable |
| LoRA adapter (this repo) | Research preview (trained for ~5k steps) |
| Closed-loop real robot eval | In progress |
| Full training pipeline | Under development |

The checkpoint was trained for 4500–5000 steps on Unitree G1 teleop data. Full validation across manipulation tasks is ongoing.

Usage

Requires the dc_groot conda environment from the GENESIS repo. See part2_manipulation/README.md.

from peft import PeftModel
from gr00t.model.demo_conditioned.dc_gr00t import DCGr00t

# Load base model
base_model = DCGr00t.from_pretrained("nvidia/GR00T-N1.6-3B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "JeffrinSam/genesis-dc-groot-adapter")
model = model.merge_and_unload()  # optional: merge for faster inference

Or via the GENESIS inference script:

conda activate dc_groot
cd GENESIS
python part2_manipulation/inference.py \
  --adapter JeffrinSam/genesis-dc-groot-adapter \
  --demo_video reference.mp4 \
  --robot unitree_g1

Adapter Details

| Parameter | Value |
|---|---|
| Base model | nvidia/GR00T-N1.6-3B |
| PEFT type | LoRA |
| Rank (r) | 8 |
| Alpha (α) | 16 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Adapter size | ~29 MB |
| Training steps | 5,000 |
| Hardware | NVIDIA RTX 5090 32 GB |
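To give a sense of what r=8 and α=16 mean in practice, the sketch below works through the standard LoRA arithmetic: the low-rank update is scaled by α/r, and each adapted linear layer gains r·(d_in + d_out) trainable parameters. The 2048×2048 layer shape is a hypothetical example, not a dimension taken from the base model, and the helper names are illustrative.

```python
def lora_scaling(alpha, r):
    """Scaling factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r

def lora_extra_params(d_in, d_out, r):
    """Trainable parameters LoRA adds to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)

r, alpha = 8, 16
d_in = d_out = 2048  # hypothetical layer shape, for illustration only

print(lora_scaling(alpha, r))             # 2.0
print(lora_extra_params(d_in, d_out, r))  # 32768
print(lora_extra_params(d_in, d_out, r) / (d_in * d_out))  # 0.0078125
```

For this hypothetical layer the adapter trains under 1% of the parameters of the full weight matrix, which is consistent with the small adapter file size.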

Citation

@article{lykov2025physicalagent,
  title     = {PhysicalAgent: Towards General Cognitive Robotics with Foundation World Models},
  author    = {Lykov, Artem and Sam, Jeffrin and Nguyen, Hung Khang and others},
  journal   = {arXiv preprint arXiv:2509.13903},
  year      = {2025}
}

Please also cite the base model:

@article{nvidia2025groot,
  title   = {GR00T N1: An Open Foundation Model for Generalist Humanoid Robots},
  author  = {NVIDIA et al.},
  year    = {2025},
  url     = {https://huggingface.co/nvidia/GR00T-N1.6-3B}
}

License

Apache 2.0. The base model (nvidia/GR00T-N1.6-3B) is subject to NVIDIA's license; check its model card before use.
