pipeline_tag: reinforcement-learning
tags:
- robotics
- reinforcement_learning
- humanoid
- soccer
- sai
- mujoco
---

## Model Details

### Model Description

This repository hosts the **Booster Soccer Controller Suite**, a collection of reinforcement learning policies and controllers powering humanoid agents in the [**Booster Soccer Showdown**](https://competesai.com/competitions/cmp_xnSCxcJXQclQ).

It contains:

1. **Low-Level Controller (`robot/`):**
   A proprioceptive policy for the **Lower T1** humanoid that converts high-level commands (forward, lateral, and yaw velocities) into joint angle targets.
2. **Competition Policies (`model/`):**
   High-level agents trained in SAI’s soccer environments that output those high-level commands for match-time play (a sketch of how the two levels compose appears below).

- **Developed by:** ArenaX Labs
- **License:** MIT
- **Frameworks:** PyTorch · MuJoCo · Stable-Baselines3
- **Environments:** Booster Gym / SAI Soccer tasks

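To make the two-level split concrete, here is a minimal sketch of how the pieces compose at run time. All names, shapes, and function bodies below are illustrative assumptions, not the repo's actual interfaces:

```python
import numpy as np

# Hypothetical stand-ins for the two policy levels (illustration only).
def competition_policy(ball_state, robot_state):
    """High-level policy: game observations -> (vx, vy, yaw_rate) command."""
    return np.array([0.5, 0.0, 0.1])  # e.g. walk forward while turning slightly

def lower_t1_controller(command, proprioception):
    """Low-level policy: velocity command + proprioception -> joint angle targets."""
    return np.zeros(12)  # placeholder joint targets; the real T1 layout differs

# One control step: the high-level command is consumed by the low-level policy.
command = competition_policy(ball_state=np.zeros(6), robot_state=np.zeros(30))
joint_targets = lower_t1_controller(command, proprioception=np.zeros(30))
```
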
## Testing Instructions

1. **Clone the repo**

```bash
git clone https://github.com/ArenaX-Labs/booster_soccer_showdown.git
cd booster_soccer_showdown
```

2. **Create & activate a Python 3.10+ environment**

```bash
# any env manager is fine; here are a few options

# --- venv ---
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# --- conda ---
# conda create -n booster-ssl python=3.11 -y && conda activate booster-ssl
```

3. **Install dependencies**

```bash
pip install -r requirements.txt
```

---

### Teleoperation

Booster Soccer Showdown supports keyboard teleop out of the box.

```bash
python booster_control/teleoperate.py \
    --env LowerT1GoaliePenaltyKick-v0
```

**Default bindings (example):**

* `W/S`: move forward/backward
* `A/D`: move left/right
* `Q/E`: rotate left/right
* `L`: reset commands
* `P`: reset environment

---

⚠️ **Note for macOS and Windows users**
Because macOS and Windows use different renderers, you may need to adjust the **position** and **rotation** sensitivity for smooth teleoperation. Run the command with the sensitivity flags set explicitly:

```bash
python booster_control/teleoperate.py \
    --env LowerT1GoaliePenaltyKick-v0 \
    --pos_sensitivity 1.5 \
    --rot_sensitivity 1.5
```

(Tune `--pos_sensitivity` and `--rot_sensitivity` as needed for your setup.)

---

### Training

The `training_scripts/` folder provides a minimal reinforcement learning pipeline for training agents with **Deep Deterministic Policy Gradient (DDPG)** in the Booster Soccer Showdown environments. The training stack consists of three scripts:

#### 1) `ddpg.py`

Defines the **DDPG_FF model**, including:

* Actor and critic neural networks with configurable hidden layers and activation functions.
* Target networks and a soft-update mechanism for stability.
* The training-step implementation (MSE critic loss, policy-gradient actor loss); see the sketch after this list.
* Utility functions for forward passes, action selection, and backpropagation.

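For orientation, here is a minimal sketch of what a DDPG update step typically looks like in PyTorch. The function signature, batch layout, and hyperparameters are illustrative assumptions, not the actual `DDPG_FF` API:

```python
import torch
import torch.nn.functional as F

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
    obs, act, rew, next_obs, done = batch  # tensors shaped (B, ...)

    # Critic: regress Q(s, a) toward the one-step TD target computed
    # with the frozen target networks.
    with torch.no_grad():
        next_q = target_critic(next_obs, target_actor(next_obs))
        td_target = rew + gamma * (1.0 - done) * next_q
    critic_loss = F.mse_loss(critic(obs, act), td_target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient, i.e. maximize Q(s, pi(s)).
    actor_loss = -critic(obs, actor(obs)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Soft update: target networks slowly track the online weights.
    with torch.no_grad():
        for net, tgt in ((actor, target_actor), (critic, target_critic)):
            for p, tp in zip(net.parameters(), tgt.parameters()):
                tp.mul_(1.0 - tau).add_(tau * p)
```
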
---

#### 2) `training.py`

Provides the **training loop** and supporting components (a sketch of the buffer and noise follows this list):

* **ReplayBuffer** for experience storage and sampling.
* **Exploration noise** injection to encourage exploration of the action space.
* An iterative training loop that:
  * interacts with the environment,
  * stores experiences,
  * periodically samples minibatches to update the actor/critic networks, and
  * tracks and logs progress (episode rewards, critic/actor loss) with `tqdm`.

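For intuition, a minimal sketch of the two supporting pieces. The class and function names below are illustrative assumptions, not the actual implementations in `training.py`:

```python
import random
from collections import deque

import numpy as np

class SimpleReplayBuffer:
    """Minimal FIFO experience buffer (illustrative, not the repo's class)."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop off first

    def push(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform random minibatch, returned as one array per field.
        batch = random.sample(self.buffer, batch_size)
        return tuple(np.array(field) for field in zip(*batch))

    def __len__(self):
        return len(self.buffer)

def noisy_action(action, sigma=0.1, low=-1.0, high=1.0):
    """Gaussian exploration noise, clipped to the valid action range."""
    noise = np.random.normal(0.0, sigma, size=np.shape(action))
    return np.clip(action + noise, low, high)
```
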
---

#### 3) `main.py`

Serves as the **entry point** for running training:

* Initializes the Booster Soccer Showdown environment via the **SAI client**.
* Defines a **Preprocessor** that normalizes and concatenates the robot state, ball state, and environment info into a training-ready observation vector.
* Instantiates a **DDPG_FF model** with a custom architecture.
* Defines an **action function** that rescales raw policy outputs to the environment-specific action bounds (see the sketch after this list).
* Calls the training loop and, after training, supports:
  * `sai.watch(...)` for visualizing learned behavior, and
  * `sai.benchmark(...)` for local benchmarking.

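The rescaling is the standard affine map from a `tanh`-bounded policy output to the environment's action range. A minimal sketch, assuming outputs in [-1, 1]; the bounds and names below are illustrative, not taken from the repo:

```python
import numpy as np

def rescale_action(raw_action, low, high):
    """Map raw policy outputs in [-1, 1] to per-dimension bounds [low, high].

    Illustrative helper; the actual action function in main.py may differ.
    """
    raw_action = np.clip(raw_action, -1.0, 1.0)
    return low + 0.5 * (raw_action + 1.0) * (high - low)

# Hypothetical velocity-command bounds (forward, lateral, yaw):
low = np.array([-1.0, -0.5, -1.0])
high = np.array([1.0, 0.5, 1.0])
print(rescale_action(np.array([0.0, 1.0, -1.0]), low, high))  # -> [ 0.   0.5 -1. ]
```
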
---

#### Example: Run Training

```bash
python training_scripts/main.py
```

This will:

1. Build the environment.
2. Initialize the model.
3. Run the training loop with replay buffer and DDPG updates.
4. Launch visualization and benchmarking after training.

#### Example: Test a Pretrained Model

```bash
python training_scripts/test.py --env LowerT1KickToTarget-v0
```