# BEAST: B-Spline Encoded Action Sequences Tokenizer
BEAST is an action tokenizer that converts continuous robot action sequences into discrete tokens using B-splines. It enables efficient trajectory compression for imitation learning by representing smooth robot motions as compact token sequences.
## Installation

Install the required dependencies:

```shell
pip install torch numpy matplotlib einops transformers
```
Note: CUDA is recommended for optimal performance, but CPU is also supported by setting `device="cpu"`.
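If you are unsure whether a GPU is present, the device string can be chosen at runtime with standard PyTorch (this is a general pattern, not BEAST-specific):

```python
import torch

# Fall back to CPU when no CUDA device is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
```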
## Quick Start
```python
from transformers import AutoProcessor
import torch

# Initialize the BEAST processor with configuration parameters:
# - num_dof: degrees of freedom (3 for 3D trajectories like x, y, z)
# - num_basis: number of B-spline basis functions used for trajectory representation
# - seq_len: length of the trajectory sequence (number of time steps)
# - degree_p: degree of the B-spline polynomial (3 = cubic spline)
# - device: computation device ('cpu' or 'cuda')
beast = AutoProcessor.from_pretrained(
    "zhouhongyi/beast",
    trust_remote_code=True,
    num_dof=3,
    num_basis=20,
    seq_len=50,
    degree_p=3,
    device="cpu",
)

# Create random trajectory data: 10 trajectories, each with 50 time steps, 3 dimensions
trajectories = torch.randn(10, 50, 3)

# Encode trajectories into discrete tokens
# update_bounds=True allows the processor to adaptively update quantization bounds
tokens = beast.encode_discrete(trajectories, update_bounds=True)
print(f"Encoded tokens shape: {tokens.shape}")

# Decode tokens back to continuous trajectories
reconstructed_trajectories = beast.decode_discrete(tokens)
print(f"Reconstructed trajectories shape: {reconstructed_trajectories.shape}")

# Calculate mean squared error to measure reconstruction quality
mse_loss = torch.mean((trajectories - reconstructed_trajectories) ** 2)
print(f"MSE Loss: {mse_loss.item()}")

# Visualize the reconstruction error for analysis
beast.visualize_reconstruction_error_discrete(trajectories)
```
## Continuous Encoding

For integration with continuous generative models:

```python
# Encode to normalized continuous parameters in [-1, 1]
params = beast.encode_continuous(trajectories, update_bounds=True)

# Decode back to trajectories
reconstructed = beast.decode_continuous(params)
```
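One simple way to map fitted B-spline weights into a bounded range is per-dimension min–max normalization; the NumPy sketch below illustrates that idea only. It is an assumption about the scheme, not BEAST's actual implementation, and `lo`/`hi` here stand in for the internally tracked weight bounds:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(10, 60))          # e.g. 10 trajectories, 20 basis * 3 DOF weights
lo, hi = w.min(axis=0), w.max(axis=0)  # per-dimension bounds tracked from data

# Normalize to [-1, 1], then invert the mapping
params = 2.0 * (w - lo) / (hi - lo) - 1.0
w_rec = (params + 1.0) / 2.0 * (hi - lo) + lo

print(np.allclose(w, w_rec))  # True: the mapping is exactly invertible
```

The round trip is lossless here; in the discrete case, quantizing `params` into a finite vocabulary is what introduces reconstruction error.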
## Parameters

| Parameter | Description | Default |
|---|---|---|
| `num_dof` | Total degrees of freedom (robot joints + gripper) | 7 |
| `num_basis` | Number of B-spline basis functions. Higher values improve reconstruction fidelity but produce more tokens | 10 |
| `seq_len` | Trajectory sequence length (number of timesteps) | 50 |
| `vocab_size` | Discrete vocabulary size (256 = 8-bit tokens) | 256 |
| `degree_p` | B-spline polynomial degree. Higher degrees produce smoother curves (3 = cubic, 4 = quartic) | 4 |
| `device` | Torch device (`"cuda"` or `"cpu"`) | `"cuda"` |
| `gripper_zero_order` | Use piecewise-constant (degree 0) splines for gripper DOFs; useful for binary gripper states | False |
| `gripper_dof` | Number of gripper DOFs, assumed to be the last DOFs. Only used when `gripper_zero_order=True` | 1 |
| `enforce_init_pos` | Enforce the initial position constraint during decoding | False |
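For intuition about `num_basis` and `degree_p`: a trajectory is represented as a weighted sum of B-spline basis functions evaluated at the trajectory's time points. The sketch below evaluates clamped B-spline bases with the Cox–de Boor recursion in plain NumPy; it is an illustrative reimplementation of the underlying math, not BEAST's internal code:

```python
import numpy as np

def clamped_knots(num_basis: int, degree: int) -> np.ndarray:
    """Clamped uniform knot vector on [0, 1] with num_basis + degree + 1 knots."""
    interior = np.linspace(0.0, 1.0, num_basis - degree + 1)
    return np.concatenate([np.zeros(degree), interior, np.ones(degree)])

def basis(i: int, p: int, t: float, knots: np.ndarray) -> float:
    """Cox-de Boor recursion: i-th degree-p B-spline basis function at time t."""
    if p == 0:
        if knots[i] <= t < knots[i + 1]:
            return 1.0
        # Close the last nonempty span so t = 1 is covered exactly once
        at_end = t == knots[-1] and knots[i] < knots[i + 1] == knots[-1]
        return 1.0 if at_end else 0.0
    out = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:
        out += (t - knots[i]) / d1 * basis(i, p - 1, t, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        out += (knots[i + p + 1] - t) / d2 * basis(i + 1, p - 1, t, knots)
    return out

num_basis, degree, seq_len = 10, 3, 50
knots = clamped_knots(num_basis, degree)
times = np.linspace(0.0, 1.0, seq_len)
# Basis matrix Phi: seq_len x num_basis; a trajectory is Phi @ weights per DOF
Phi = np.array([[basis(i, degree, t, knots) for i in range(num_basis)] for t in times])
print(Phi.sum(axis=1))  # partition of unity: each row sums to 1
```

Increasing `num_basis` adds columns to `Phi` (finer trajectory resolution, more tokens), while `degree_p` controls the smoothness of each basis function.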
## Token Count

The total number of tokens per trajectory is `num_basis * num_dof`.

For example, with the default settings (10 basis functions, 7 DOF): 70 tokens per trajectory.
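As a quick sanity check of the formula with the default parameters:

```python
num_basis, num_dof = 10, 7  # default settings
tokens_per_trajectory = num_basis * num_dof
print(tokens_per_trajectory)  # 70
```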
## API Reference

### Encoding Methods
`encode_discrete(trajs, update_bounds=True)`

- Input: trajectories tensor `[batch, seq_len, num_dof]`
- Output: discrete tokens `[batch, num_basis * num_dof]` in range `[0, vocab_size - 1]`
- `update_bounds`: whether to update internal weight bounds from this batch
`encode_continuous(trajs, update_bounds=True)`

- Input: trajectories tensor `[batch, seq_len, num_dof]`
- Output: normalized parameters `[batch, num_basis * num_dof]` in range `[-1, 1]`
### Decoding Methods
`decode_discrete(tokens, times=None, init_pos=None)`

- Input: discrete tokens `[batch, num_basis * num_dof]`
- Output: reconstructed trajectories `[batch, seq_len, num_dof]`
- `times`: custom time points (optional; defaults to `seq_len` uniform points)
- `init_pos`: initial position constraint (optional)
`decode_continuous(params, times=None, init_pos=None)`

- Input: normalized parameters `[batch, num_basis * num_dof]`
- Output: reconstructed trajectories `[batch, seq_len, num_dof]`
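The decode direction inverts the quantization and then evaluates the spline by multiplying with a basis matrix. The NumPy sketch below illustrates the shape flow only; `Phi`, `lo`, and `hi` are stand-ins for the processor's internal basis matrix and weight bounds, not the actual implementation:

```python
import numpy as np

vocab_size, num_basis, num_dof, seq_len = 256, 10, 7, 50
rng = np.random.default_rng(0)
tokens = rng.integers(0, vocab_size, size=(4, num_basis * num_dof))
lo, hi = -2.0, 2.0                      # stand-in scalar weight bounds
Phi = rng.random((seq_len, num_basis))  # stand-in basis matrix (seq_len x num_basis)

# Dequantize tokens back to continuous weights, then evaluate the spline
w = tokens / (vocab_size - 1) * (hi - lo) + lo  # [batch, num_basis * num_dof]
w = w.reshape(-1, num_dof, num_basis)           # per-DOF weight vectors
trajs = np.einsum("tb,kdb->ktd", Phi, w)        # [batch, seq_len, num_dof]
print(trajs.shape)  # (4, 50, 7)
```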
### Utility Methods

`compute_reconstruction_error(raw_traj)`

- Compute the MSE between the original and reconstructed trajectory

`visualize_reconstruction_error_discrete(raw_traj)` / `visualize_reconstruction_error_continuous(raw_traj)`

- Plot original vs. reconstructed trajectories for visual comparison
## Uses

### Intended Use Cases
- Robot Imitation Learning: Compress continuous demonstration trajectories into discrete tokens for language model-based policy learning
- Trajectory Compression: Reduce memory footprint of robot demonstration datasets while preserving motion quality
- Action Tokenization: Enable transformer-based models to process robot actions as discrete token sequences
## Citation
If you use BEAST in your research, please cite:
BibTeX:
```bibtex
@inproceedings{
    zhou2025beast,
    title={{BEAST}: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning},
    author={Hongyi Zhou and Weiran Liao and Xi Huang and Yucheng Tang and Fabian Otto and Xiaogang Jia and Xinkai Jiang and Simon Hilber and Ge Li and Qian Wang and {\"O}mer Erdin{\c{c}} Ya{\u{g}}murlu and Nils Blank and Moritz Reuss and Rudolf Lioutikov},
    booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
    year={2025},
    url={https://openreview.net/forum?id=rQCl1sf62w}
}
```