Timsty committed on
Commit 26223d4 · verified · 1 parent(s): 2601235

Update README.md

Files changed (1): README.md +29 -73
README.md CHANGED
@@ -1,29 +1,34 @@
  ---
- license: apache-2.0
  pipeline_tag: robotics
- library_name: transformers
  ---

  # Mixture of Horizons in Action Chunking

- This repository hosts the official models and code for the paper:
- [**Mixture of Horizons in Action Chunking**](https://huggingface.co/papers/2511.19433)
-
- Project Page: https://timsty1.github.io/moh/
-
- Code Repository: https://github.com/Timsty1/MixtureOfHorizons/tree/main

  ## Introduction
- Vision-language-action (VLA) models have shown remarkable capabilities in robotic manipulation, but their performance is sensitive to the **action chunk length** used during training, termed the **horizon**. This paper proposes a **mixture of horizons (MoH)** strategy to mitigate the inherent trade-off between long-term foresight and short-term precision observed with fixed horizons. MoH rearranges action chunks into segments with different horizons, processes them in parallel with a shared action transformer, and fuses the outputs. This allows MoH to exploit long-term foresight and short-term precision jointly within a single model, improving performance and generalizability with minimal overhead. MoH also enables dynamic inference with adaptive horizons, achieving higher throughput while preserving superior performance.

  <div align="center">
  <table border="0" cellspacing="0" cellpadding="0">
  <tr>
  <td align="center" width="50%">
- <img src="https://huggingface.co/Timsty/mixture_of_horizons/resolve/main/figure/study_of_horizons_pi0.png" alt="Trade-off Effect" width="100%">
  </td>
  <td align="center" width="50%">
- <img src="https://huggingface.co/Timsty/mixture_of_horizons/resolve/main/figure/intro_motivation_v2.png" alt="Mixture of Horizons" width="100%">
  </td>
  </tr>
  <tr>
@@ -37,76 +42,27 @@ Vision-language-action (VLA) models have shown remarkable capabilities in roboti
  </table>
  </div>

- ## Quick Start
-
- ### 1. Environment Setup
-
- Clone the repository and set up the conda environment:
-
- ```bash
- git clone git@github.com:Timsty1/MixtureOfHorizons.git
- conda create -n moh -y python=3.10
- conda activate moh
- pip install uv
- cd MixtureOfHorizons
- uv pip install -r requirements.txt
- pip install packages/libero
- pip install packages/openpi-client
- ```
-
- ### 2. Modify the Transformers Library
-
- This implementation requires modifying the `transformers` library to support the PyTorch-based $\pi$-series models, which rely on *gemma*, *paligemma*, and *siglip*.
-
- First, locate your conda environment path:
- ```bash
- conda info --base
- ```
- Then copy the provided files into the transformers library directory (replace `YOUR_CONDA_DIR` with the path found above):
- ```bash
- cp -r ./src/openpi/models_pytorch/transformers_replace/* YOUR_CONDA_DIR/envs/moh/lib/python3.10/site-packages/transformers/
- ```
-
- ### 3. Inference with Code
- You can use the provided `eagenerate` to speed up generation, just like using `generate` from Hugging Face. Here is an example.
-
- ```python
- import torch
- from eagle.model.ea_model import EaModel
- from fastchat.model import get_conversation_template
-
- # Replace with paths to your base model and EAGLE model checkpoints
- # Example: base_model_path = "lmsys/vicuna-13b-v1.3", EAGLE_model_path = "Timsty/mixture_of_horizons"
- base_model_path = "path/to/your/base_model"
- EAGLE_model_path = "path/to/your/eagle_model"
-
- model = EaModel.from_pretrained(
-     base_model_path=base_model_path,
-     ea_model_path=EAGLE_model_path,
-     torch_dtype=torch.float16,
-     low_cpu_mem_usage=True,
-     device_map="auto",
-     total_token=-1,
- )
- model.eval()
-
- your_message = "Hello"
- conv = get_conversation_template("vicuna")  # Use the correct template for your base model
- conv.append_message(conv.roles[0], your_message)
- conv.append_message(conv.roles[1], None)
- prompt = conv.get_prompt()
-
- input_ids = model.tokenizer([prompt]).input_ids
- input_ids = torch.as_tensor(input_ids).cuda()
- output_ids = model.eagenerate(input_ids, temperature=0.5, max_new_tokens=512)
- output = model.tokenizer.decode(output_ids[0])
- print(output)
- ```
- **Note:** Vicuna, LLaMA2-Chat, and LLaMA3-Instruct are all chat models. You must use the correct chat template; otherwise the model may produce abnormal output and EAGLE's performance will degrade.

  ## ❤️ Acknowledgment

  We express our gratitude to [OpenPi](https://github.com/Physical-Intelligence/openpi/tree/main), [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO), and [RoboTwin](https://robotwin-platform.github.io/) for their open-source contributions.

  ## 📝 Citation

  If you find this paper, the models, or the code helpful, please cite our paper. Thank you for your support!

  ```bibtex
  ---
  pipeline_tag: robotics
+ license: apache-2.0
+ tags:
+ - reinforcement-learning
+ - robotic-manipulation
+ - action-chunking
  ---

  # Mixture of Horizons in Action Chunking

+ This repository hosts the official implementation of **Mixture of Horizons (MoH)**, introduced in the paper [Mixture of Horizons in Action Chunking](https://huggingface.co/papers/2511.19433).
+
+ Vision-language-action (VLA) models for robotic manipulation are highly sensitive to the chosen **action chunk length**, termed the **horizon** in this work. A fixed horizon presents an inherent trade-off: longer horizons offer superior global foresight but compromise fine-grained accuracy, while shorter ones provide precise local control but struggle with long-horizon tasks.
+
+ To address this challenge, we propose **Mixture of Horizons (MoH)**, a plug-and-play strategy that fuses multiple horizons within a single policy. MoH processes action chunks in parallel segments with different horizons and integrates their outputs, leveraging long-term foresight and short-term precision simultaneously with minimal overhead. It also enables **dynamic inference** through cross-horizon consensus for greater efficiency and robustness in complex robotic tasks.
+
+ - 📄 [Paper](https://huggingface.co/papers/2511.19433)
+ - 📝 [Project Page](https://timsty1.github.io/moh/)
+ - 💻 [Code](https://github.com/Timsty1/MixtureOfHorizons/tree/main)
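As a toy illustration of the parallel-horizon idea (hypothetical code, not the repository's implementation — `predict_chunk` and `fuse_horizons` are stand-in names, and simple averaging is only one plausible fusion rule):

```python
# Toy sketch of mixture-of-horizons fusion (hypothetical, NOT this repo's API).
# A stand-in policy predicts one action chunk per horizon; timesteps covered
# by every horizon are fused by averaging the per-horizon predictions.
from typing import List


def predict_chunk(horizon: int) -> List[float]:
    # Stand-in for the shared action transformer's per-horizon prediction.
    return [0.1 * t for t in range(horizon)]


def fuse_horizons(horizons: List[int]) -> List[float]:
    chunks = {h: predict_chunk(h) for h in horizons}
    fused = []
    for t in range(min(horizons)):  # fuse only steps every horizon covers
        predictions = [chunks[h][t] for h in horizons]
        fused.append(sum(predictions) / len(predictions))
    return fused


print(len(fuse_horizons([4, 8, 16])))  # 4 fused actions
```

The real model fuses inside a shared action transformer rather than post hoc; see the paper for the actual formulation.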

  ## Introduction

  <div align="center">
  <table border="0" cellspacing="0" cellpadding="0">
  <tr>
  <td align="center" width="50%">
+ <img src="https://github.com/Timsty1/MixtureOfHorizons/raw/main/figure/study_of_horizons_pi0.png" alt="Trade-off Effect" width="100%">
  </td>
  <td align="center" width="50%">
+ <img src="https://github.com/Timsty1/MixtureOfHorizons/raw/main/figure/intro_motivation_v2.png" alt="Mixture of Horizons" width="100%">
  </td>
  </tr>
  <tr>

  </table>
  </div>
+ <br>
+
+ * **Mitigates Trade-off**: Addresses the inherent trade-off between long-term foresight and short-term precision induced by a single action-chunk horizon.
+ * **Plug-and-Play**: Integrates easily into existing full-attention action modules with minimal training or inference overhead.
+ * **Dynamic Inference**: Achieves higher efficiency and robustness by selecting stable actions through cross-horizon consensus.
+
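Cross-horizon consensus can be illustrated with a minimal sketch (hypothetical logic under an assumed agreement-threshold rule, not the repository's implementation): execute only the prefix of actions on which the different horizons still agree, and replan once they diverge.

```python
# Toy sketch of dynamic inference via cross-horizon consensus (hypothetical,
# NOT this repo's implementation). Only the prefix of actions on which the
# per-horizon predictions agree within a tolerance is executed.
from typing import List


def consensus_prefix(chunks: List[List[float]], tol: float = 0.05) -> List[float]:
    steps = min(len(c) for c in chunks)
    executed = []
    for t in range(steps):
        predictions = [c[t] for c in chunks]
        if max(predictions) - min(predictions) > tol:  # horizons diverge: stop
            break
        executed.append(sum(predictions) / len(predictions))
    return executed


short_horizon = [0.10, 0.20, 0.30, 0.40]             # toy 1-D actions
long_horizon = [0.11, 0.21, 0.45, 0.55, 0.60, 0.70]  # toy 1-D actions
print(len(consensus_prefix([short_horizon, long_horizon])))  # 2 agreeing steps
```

Executing longer prefixes when horizons agree is what yields the higher throughput reported above; the concrete consensus criterion is defined in the paper.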
+ #### More results on LIBERO
+ <div align="center">
+ <img src="https://github.com/Timsty1/MixtureOfHorizons/raw/main/figure/libero_main.jpg" width="90%" />
+ </div>
+
+ ## Usage
+
+ For detailed instructions on environment setup, training, and evaluation, please refer to the [GitHub repository](https://github.com/Timsty1/MixtureOfHorizons/tree/main).

  ## ❤️ Acknowledgment

  We express our gratitude to [OpenPi](https://github.com/Physical-Intelligence/openpi/tree/main), [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO), and [RoboTwin](https://robotwin-platform.github.io/) for their open-source contributions.

  ## 📝 Citation
+
  If you find this paper, the models, or the code helpful, please cite our paper. Thank you for your support!

  ```bibtex