---
license: mit
language:
- en
tags:
- reinforcement-learning
- pytorch
- dsac
- aerospace
- flight-control
- boeing-747
- continuous-control
- gymnasium
library_name: tensoraerospace
pipeline_tag: reinforcement-learning
model-index:
- name: DSAC-B747-PitchControl
  results:
  - task:
      type: reinforcement-learning
      name: Pitch Angle Tracking Control
    dataset:
      type: custom
      name: Boeing 747 Longitudinal Dynamics Simulation
    metrics:
    - type: overshoot
      value: 0.99
      name: Overshoot (%)
    - type: settling_time
      value: 0.40
      name: Settling Time (s)
    - type: rise_time
      value: 0.40
      name: Rise Time (s)
    - type: peak_time
      value: 10.50
      name: Peak Time (s)
    - type: static_error
      value: 0.0002
      name: Static Error
    - type: performance_index
      value: 0.269
      name: Performance Index
---

# DSAC Agent for Boeing 747 Pitch Angle Control
![TensorAeroSpace](https://raw.githubusercontent.com/TensorAeroSpace/TensorAeroSpace/main/img/logo-no-background.png)

**Distributional Soft Actor-Critic (DSAC) for Longitudinal Aircraft Control**

[![TensorAeroSpace](https://img.shields.io/badge/%F0%9F%9A%80-TensorAeroSpace-blue)](https://github.com/TensorAeroSpace/TensorAeroSpace) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
## Model Description

This model is a **Distributional Soft Actor-Critic (DSAC)** agent trained to control the pitch angle (θ) of a **Boeing 747** aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.

![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/7QBAzPCcRlo6KNcTe58J2.png)
![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/6udlxWC573gza2EdoXEhv.png)

### Intended Uses

- **Primary Use**: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation
- **Research Applications**: Benchmarking RL algorithms for aerospace control systems
- **Educational**: Learning reinforcement learning concepts in aerospace applications
- **Hybrid Control**: Can be combined with PID/MPC controllers for robust flight control

### Model Architecture

The DSAC agent consists of separate **Actor** and **Critic** neural networks with distributional value estimation:

#### Actor Network (Policy)

| Layer | Configuration |
|-------|---------------|
| Input | 5 (observation dim with reference) |
| Hidden 1 | Linear(5, 256) + ReLU |
| Hidden 2 | Linear(256, 256) + ReLU |
| Output (μ) | Linear(256, 1) + Tanh |
| Output (log σ) | Linear(256, 1), clamped |

#### Twin Critic Networks (Distributional Q-Function)

| Layer | Configuration |
|-------|---------------|
| Input | 5 (obs) + 1 (action) |
| Hidden 1 | Linear(6, 256) + ReLU |
| Hidden 2 | Linear(256, 256) + ReLU |
| Output | Linear(256, 1) |

### State Space

The observation vector consists of 5 normalized states representing the longitudinal dynamics (with `include_reference_in_obs=True`):

| Index | State | Description | Units |
|-------|-------|-------------|-------|
| 0 | u | Forward velocity perturbation | normalized |
| 1 | w | Vertical velocity perturbation | normalized |
| 2 | q | Pitch rate | normalized |
| 3 | θ | Pitch angle (tracking target) | normalized |
| 4 | θ_ref | Reference pitch angle | normalized |

### Action Space

| Dimension | Description | Range |
|-----------|-------------|-------|
| 1 | Elevator deflection | [-1.0, 1.0] (normalized) |

The normalized action is scaled to physical elevator deflection in degrees by the environment.

## Training Details

### Environment Configuration

| Parameter | Value |
|-----------|-------|
| Environment | `ImprovedB747Env` |
| Time Step (dt) | 0.1 s |
| Episode Duration | 20 s |
| Initial State | [0, 0, 0, 0] |
| Reference Signal | Step function |
| Step Amplitude | 1.0° |
| Step Time | 5.0 s |
| Reward Mode | `step_response` |
| Include Reference in Obs | True |

### Training Infrastructure

- **Hardware**: CPU / GPU / MPS (auto-select)
- **Framework**: PyTorch 2.0+

## Evaluation Results

### Performance Metrics

| Metric | Value |
|--------|-------|
| **Overshoot** | 0.99% |
| **Settling Time (±5%)** | 0.40 s |
| **Rise Time** | 0.40 s |
| **Peak Time** | 10.50 s |
| **Static Error** | -0.0002 |
| **Oscillation Count** | 0 |
| **Performance Index** | 0.269 |

### Integral Criteria

| Criterion | Value |
|-----------|-------|
| IAE (Integral Absolute Error) | 0.05 |
| ISE (Integral Squared Error) | 0.00 |
| ITAE (Integral Time-weighted Absolute Error) | 0.18 |

### Step Response Characteristics

The agent demonstrates excellent step tracking performance with:

- ✅ Minimal overshoot (~1%)
- ✅ Fast settling time (0.4 s)
- ✅ Quick rise time (0.4 s)
- ✅ Near-zero static error
- ✅ No oscillations

## Usage

### Installation

```bash
pip install tensoraerospace
```

### Quick Start

```python
import numpy as np
import torch

from tensoraerospace.agent import DSAC
from tensoraerospace.envs.b747 import ImprovedB747Env
from tensoraerospace.signals.standart import unit_step


def pick_device() -> str:
    if torch.cuda.is_available():
        return "cuda"
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return "mps"
    return "cpu"


# Set up the environment
dt = 0.1
tn = 20.0
step_deg = 1.0
step_time_sec = 5.0
t = np.arange(0.0, tn + dt, dt, dtype=np.float32)

# Create the step reference signal (1-degree step at t = 5 s)
ref = unit_step(t, degree=step_deg, time_step=step_time_sec, output_rad=True).reshape(1, -1)

env = ImprovedB747Env(
    initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=float),
    reference_signal=ref,
    number_time_steps=ref.shape[1],
    dt=dt,
    include_reference_in_obs=True,
    reward_mode="step_response",
)

# Load the pretrained agent
agent = DSAC.from_pretrained("TensorAeroSpace/dsac-b747-step-response")
agent.env = env
agent.to_device(pick_device())
agent.eval()

# Run one evaluation episode
obs, _ = env.reset()
done = False
total_reward = 0.0
while not done:
    action = agent.select_action(obs, evaluate=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = bool(terminated or truncated)
    total_reward += float(reward)

print(f"Episode reward: {total_reward}")
```

### Load from Local Checkpoint

```python
from tensoraerospace.agent import DSAC

# Load from a local directory
agent = DSAC.from_pretrained("./path/to/checkpoint")
```

## Limitations

- **Fixed Aircraft Model**: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
- **Step Reference Only**: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
- **Simulation Gap**: Trained in simulation; real-world deployment would require additional validation
- **State Observability**: Assumes all longitudinal states are observable
- **Linear Dynamics**: Based on a linearized aircraft model around trim conditions

## Ethical Considerations

- **Not for Real Flight Control**: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
- **Simulation Only**: All training and evaluation were performed in simulation environments.
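## Reproducing the Step-Response Metrics

The step-response metrics reported in the Evaluation Results (overshoot, settling time, rise time, static error) can be recomputed from a pitch trajectory recorded during an evaluation episode. The pure-Python sketch below illustrates one common set of definitions, using the same ±5% settling band as the table above; the function names and the synthetic trajectory are illustrative only, not part of the TensorAeroSpace API.

```python
def step_metrics(theta, final, dt, band=0.05):
    """Overshoot (%), settling time (s), and static error for a step response.

    Assumes a nonzero final (commanded) value. Settling time is taken as the
    last instant the response leaves the +/-`band` envelope around `final`.
    """
    overshoot = max(0.0, (max(theta) - final) / final * 100.0)
    settling = 0.0
    for i, y in enumerate(theta):
        if abs(y - final) > band * abs(final):
            settling = (i + 1) * dt
    static_error = theta[-1] - final
    return overshoot, settling, static_error


def rise_time(theta, final, dt, lo=0.1, hi=0.9):
    """10%-90% rise time of a step response."""
    t_lo = next(i for i, y in enumerate(theta) if y >= lo * final) * dt
    t_hi = next(i for i, y in enumerate(theta) if y >= hi * final) * dt
    return t_hi - t_lo


# Synthetic example: a response that ramps to 1.0 with no overshoot
dt = 0.1
theta = [min(1.0, 0.25 * i) for i in range(50)]
os_, ts, se = step_metrics(theta, final=1.0, dt=dt)
rt = rise_time(theta, final=1.0, dt=dt)
```

In a real run, `theta` would be the sequence of (denormalized) pitch angles logged from `env.step`, aligned to the post-step portion of the reference signal.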
## Citation

If you use this model in your research, please cite:

```bibtex
@software{tensoraerospace2024,
  title   = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
  author  = {TensorAeroSpace Team},
  year    = {2024},
  url     = {https://github.com/TensorAeroSpace/TensorAeroSpace},
  license = {MIT}
}
```

## Model Card Authors

TensorAeroSpace Team

## Model Card Contact

- **GitHub**: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace)
- **Documentation**: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/)
- **Hugging Face**: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace)