---
datasets:
- behavior-1k/2025-challenge-demos
- IliaLarchenko/behavior_224_rgb
license: apache-2.0
tags:
- robotics
pipeline_tag: robotics
---

This is an intermediate checkpoint used in our [1st place solution to the 2025 BEHAVIOR Challenge](https://github.com/IliaLarchenko/behavior-1k-solution). It was obtained by training the policy on 50 tasks simultaneously for about two weeks. It is not part of our [final submission](https://huggingface.co/IliaLarchenko/behavior_submission), and we did not run the full evaluation on it, but we would expect it to achieve a q-score of roughly 15–20%.

Paper: [Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge](https://huggingface.co/papers/2512.06951)

Project page: https://behavior.stanford.edu/challenge/

Code/GitHub repository: [IliaLarchenko/behavior-1k-solution](https://github.com/IliaLarchenko/behavior-1k-solution)

arXiv: [2512.06951](https://arxiv.org/abs/2512.06951)

The [final submission checkpoints](https://huggingface.co/IliaLarchenko/behavior_submission) are available in a separate repository.

## Sample Usage

This section provides a quick overview of how to get started with the model, adapted from the [GitHub repository](https://github.com/IliaLarchenko/behavior-1k-solution).
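The checkpoint itself can be fetched the same way the datasets are downloaded below. This is a minimal sketch using `huggingface_hub.snapshot_download`; the repo id is a placeholder to be replaced with this model repository's actual id, and `checkpoint_dir` is a hypothetical helper, not part of the official code:

```python
# Minimal sketch: downloading a checkpoint snapshot with huggingface_hub.
# NOTE: the repo id used below is a placeholder, not a real repository.
from pathlib import Path


def checkpoint_dir(root: str, repo_id: str) -> Path:
    """Local directory for the snapshot, named after the repo."""
    return Path(root) / repo_id.split("/")[-1]


def fetch_checkpoint(repo_id: str, root: str = "./checkpoints") -> str:
    """Download the full repo snapshot and return its local path."""
    from huggingface_hub import snapshot_download  # requires `huggingface_hub`

    target = checkpoint_dir(root, repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return str(target)


# Usage (placeholder id -- substitute the real one):
# fetch_checkpoint("IliaLarchenko/<this-checkpoint-repo>")
```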
### Installation

```bash
# Clone with submodules (includes openpi and BEHAVIOR-1K)
git clone --recurse-submodules https://github.com/IliaLarchenko/behavior-1k-solution.git
cd behavior-1k-solution

# Run the setup script (installs uv and dependencies, and sets up the environment)
bash setup_remote.sh
```

### Dataset Preparation

Download the official BEHAVIOR-1K dataset from HuggingFace:

```bash
# Log in to HuggingFace (needed to avoid request rate limit errors)
uv run huggingface-cli login

# Download the full dataset (~2TB)
uv run python - <<'PY'
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="behavior-1k/2025-challenge-demos",
    repo_type="dataset",
    local_dir="./data/behavior_dataset",
    local_dir_use_symlinks=False,
)
PY
```

**Alternative**: use the resized RGB-only dataset (224×224, ~260GB) for faster training:

```bash
uv run python - <<'PY'
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="IliaLarchenko/behavior_224_rgb",
    repo_type="dataset",
    local_dir="./data/behavior_224_rgb",
    local_dir_use_symlinks=False,
)
PY
```

### Pre-training Setup

Compute dataset statistics and train the FAST tokenizer:

```bash
# Compute normalization statistics with a correlation matrix
uv run scripts/compute_norm_stats.py --config-name pi_behavior_b1k_fast --correlation

# Train the FAST tokenizer for action discretization
uv run scripts/train_fast_tokenizer.py \
  --config-name pi_behavior_b1k_fast \
  --encoded-dims="0:6,7:23" \
  --vocab-size=1024
```

### Training

**Single GPU training**:

```bash
uv run scripts/train.py pi_behavior_b1k_fast \
  --batch_size=16 \
  --num_train_steps=200000 \
  --save_interval=2000 \
  --keep_period=10000 \
  --log_interval=100
```

**Multi-GPU training**:

```bash
uv run scripts/train.py pi_behavior_b1k_fast \
  --batch_size=2048 \
  --num_train_steps=200000 \
  --fsdp_devices=8 \
  --save_interval=250 \
  --keep_period=4000 \
  --log_interval=25
```

### Evaluation

Start the policy server:

```bash
uv run scripts/serve_b1k.py policy:checkpoint \
  --policy.config \
  pi_behavior_b1k_fast \
  --policy.dir /path/to/checkpoint
```

In a separate terminal, [run the evaluation](https://behavior.stanford.edu/challenge/baselines.html) (requires the BEHAVIOR-1K environment):

```bash
python BEHAVIOR-1K/omnigibson/learning/eval.py \
  log_path=./eval_logs \
  policy=websocket \
  task.name=make_microwave_popcorn \
  model.host=localhost \
  eval_instance_ids="[0,1,2,3]"
```

## Citation

If you find this work useful, please cite:

```bibtex
@misc{larchenko2025behavior,
  title={Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge},
  author={Ilia Larchenko and Gleb Zarin and Akash Karnatak},
  year={2025},
  eprint={2512.06951},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2512.06951},
}
```