TensorAeroSpace/ppo-b747-step-response

+---
+license: mit
+language:
+  - en
+tags:
+  - reinforcement-learning
+  - pytorch
+  - ppo
+  - aerospace
+  - flight-control
+  - boeing-747
+  - continuous-control
+  - gymnasium
+library_name: tensoraerospace
+pipeline_tag: reinforcement-learning
+model-index:
+  - name: PPO-B747-PitchControl
+    results:
+      - task:
+          type: reinforcement-learning
+          name: Pitch Angle Tracking Control
+        dataset:
+          type: custom
+          name: Boeing 747 Longitudinal Dynamics Simulation
+        metrics:
+          - type: eval_reward
+            value: 0.9137
+            name: Best Evaluation Reward
+          - type: overshoot
+            value: 0.49
+            name: Overshoot (%)
+          - type: settling_time
+            value: 0.60
+            name: Settling Time (s)
+          - type: rise_time
+            value: 0.30
+            name: Rise Time (s)
+          - type: static_error
+            value: 0.0046
+            name: Static Error
+---
+# PPO Agent for Boeing 747 Pitch Angle Control
+<div align="center">
+
+![TensorAeroSpace](https://raw.githubusercontent.com/TensorAeroSpace/TensorAeroSpace/main/img/logo-no-background.png)
+
+**Proximal Policy Optimization (PPO) for Longitudinal Aircraft Control**
+
+[![TensorAeroSpace](https://img.shields.io/badge/%F0%9F%9A%80-TensorAeroSpace-blue)](https://github.com/TensorAeroSpace/TensorAeroSpace)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
+
+</div>
+## Model Description
+This model is a **Proximal Policy Optimization (PPO)** agent trained to control the pitch angle (θ) of a **Boeing 747** aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.
+![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/g79y7SGa8VyXCDqDjd_GO.png)
+![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/OZcb5JP_txYA9WEqjHGa5.png)
+### Intended Uses
+- **Primary Use**: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation
+- **Research Applications**: Benchmarking RL algorithms for aerospace control systems
+- **Educational**: Learning reinforcement learning concepts in aerospace applications
+- **Hybrid Control**: Can be combined with PID/MPC controllers for robust flight control
+### Model Architecture
+The PPO agent consists of separate **Actor** and **Critic** neural networks:
+#### Actor Network (Policy)
+| Layer | Configuration |
+|-------|--------------|
+| Input | 4 (observation dim) |
+| Hidden 1 | Linear(4, 256) + ReLU |
+| Hidden 2 | Linear(256, 256) + ReLU |
+| Output (μ) | Linear(256, 1) + Tanh |
+| Output (log σ) | Linear(256, 1), clamped to [-5.0, -1.5] |
+#### Critic Network (Value Function)
+| Layer | Configuration |
+|-------|--------------|
+| Input | 4 (observation dim) |
+| Hidden 1 | Linear(4, 256) + ReLU |
+| Hidden 2 | Linear(256, 256) + ReLU |
+| Output | Linear(256, 1) |
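
As a rough sanity check on the two tables above, the sketch below counts trainable parameters and converts the log σ clamp into bounds on σ. It assumes that every `Linear` layer has a bias term and that the μ and log σ heads share the actor's two hidden layers; this card states neither detail, and `linear_params` is a helper introduced here purely for illustration.

```python
import math


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer (bias assumed)."""
    return n_in * n_out + n_out


# Both networks use the same hidden stack shape: Linear(4, 256) -> Linear(256, 256).
hidden_stack = linear_params(4, 256) + linear_params(256, 256)

# Actor: hidden stack plus two scalar heads (mean and log std), assumed to share the stack.
actor_params = hidden_stack + 2 * linear_params(256, 1)
# Critic: hidden stack plus one scalar value head.
critic_params = hidden_stack + linear_params(256, 1)

# The clamp on log sigma bounds the standard deviation of the Gaussian policy.
sigma_min, sigma_max = math.exp(-5.0), math.exp(-1.5)

print(actor_params, critic_params)
print(f"sigma in [{sigma_min:.4f}, {sigma_max:.4f}]")
```

Under these assumptions each network has roughly 67k parameters, and the policy's standard deviation stays between about 0.007 and 0.22 in normalized action units.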
+### State Space
+The observation vector consists of 4 normalized states representing the longitudinal dynamics:
+| Index | State | Description | Units |
+|-------|-------|-------------|-------|
+| 0 | u | Forward velocity perturbation | normalized |
+| 1 | w | Vertical velocity perturbation | normalized |
+| 2 | q | Pitch rate | normalized |
+| 3 | θ | Pitch angle (tracked variable) | normalized |
+### Action Space
+| Dimension | Description | Range |
+|-----------|-------------|-------|
+| 1 | Elevator deflection | [-1.0, 1.0] (normalized) |
+
+The environment scales the normalized action to a physical elevator deflection in degrees.
+## Training Details
+### Training Configuration
+| Hyperparameter | Value |
+|----------------|-------|
+| Algorithm | PPO (Clip) |
+| Max Episodes | 90,000 |
+| Rollout Length | 256 steps |
+| Batch Size | 16,384 |
+| Epochs per Update | 2 |
+| Clip Parameter (ε) | 0.15 |
+| Discount Factor (γ) | 0.995 |
+| GAE Lambda (λ) | 0.95 |
+| Actor Learning Rate | 1e-4 |
+| Critic Learning Rate | 2e-4 |
+| Entropy Coefficient | 0.01 |
+| Max Gradient Norm | 0.5 |
+| Target KL | 0.01 |
+| Normalize Observations | False |
+| Normalize Rewards | True |
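
To make the roles of γ, λ, and ε in the table concrete, here is a minimal pure-Python sketch of Generalized Advantage Estimation and the PPO-Clip surrogate in their standard textbook form. It is not taken from the `tensoraerospace` implementation: the function names are illustrative, and episode boundaries, reward normalization, and the target-KL early stop are deliberately left out.

```python
def gae(rewards, values, last_value, gamma=0.995, lam=0.95):
    """Generalized Advantage Estimation over one rollout with no episode ends."""
    advantages = [0.0] * len(rewards)
    next_value, running = last_value, 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD residual at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of current and future residuals.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(ratio, advantage, eps=0.15):
    """PPO-Clip objective for a single sample (a quantity to maximize)."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# Toy inputs (hypothetical, not training data).
toy_advantages = gae(rewards=[1.0, 1.0], values=[0.0, 0.0], last_value=0.0)
capped_gain = clipped_surrogate(ratio=1.5, advantage=1.0)
uncapped_loss = clipped_surrogate(ratio=1.5, advantage=-1.0)
```

With ε = 0.15, a sample with positive advantage stops contributing gradient once the probability ratio exceeds 1.15, while a sample with negative advantage and a large ratio is still penalized in full.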
+### Environment Configuration
+| Parameter | Value |
+|-----------|-------|
+| Environment | `ImprovedB747VecEnvTorch` |
+| Number of Parallel Envs | 64 |
+| Time Step (dt) | 0.1 s |
+| Episode Duration | 20 s |
+| Initial State | [0, 0, 0, 0] |
+| Reference Signal | Step function |
+| Step Amplitude Range | 1.0° |
+| Step Time Range | 5.0 s |
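
A few derived quantities help relate the two configuration tables. The sketch below is plain arithmetic on the listed values. Reading the batch size as one full rollout across all parallel environments is an inference from the numbers, not something the card states, and 1/(1 - γ) is only a common rule of thumb for the effective discount horizon.

```python
rollout_length = 256      # steps per environment per update (Training Configuration)
num_envs = 64             # parallel environments
dt = 0.1                  # seconds per simulation step
episode_duration = 20.0   # seconds
gamma = 0.995             # discount factor

# Transitions collected per update if every environment contributes a full rollout.
transitions_per_update = rollout_length * num_envs

# Control steps in one episode (the exact sample count may differ by one,
# depending on whether the time grid includes both endpoints).
steps_per_episode = round(episode_duration / dt)

# Rule-of-thumb horizon over which discounted rewards still matter.
effective_horizon_steps = 1.0 / (1.0 - gamma)
effective_horizon_seconds = effective_horizon_steps * dt

print(transitions_per_update, steps_per_episode)
print(f"{effective_horizon_steps:.0f} steps, {effective_horizon_seconds:.0f} s")
```

The product 256 × 64 matches the listed batch size of 16,384, and the rule-of-thumb horizon of about 200 steps (20 s) coincides with the episode duration.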
+### Training Infrastructure
+- **Hardware**: NVIDIA GPU with CUDA support
+- **Framework**: PyTorch 2.0+
+- **Best Checkpoint**: reached at episode 7,510 of the 90,000-episode maximum
+## Evaluation Results
+### Performance Metrics
+| Metric | Value |
+|--------|-------|
+| **Best Evaluation Reward** | 0.9137 |
+| **Overshoot** | 0.49% |
+| **Settling Time** | 0.60 s |
+| **Rise Time** | 0.30 s |
+| **Peak Time** | 0.80 s |
+| **Static Error** | -0.0046 |
+| **Oscillation Count** | 1 |
+| **Performance Index** | 3.06 |
+### Integral Criteria
+| Criterion | Value |
+|-----------|-------|
+| IAE (Integral Absolute Error) | 4.08 |
+| ISE (Integral Squared Error) | 2.64 |
+| ITAE (Integral Time-weighted Absolute Error) | 4.77 |
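
For readers unfamiliar with these criteria, the following sketch shows one common way to approximate them from a sampled tracking error using a rectangle rule. The card does not state the units of the error, the integration window, or any normalization behind the reported values, so this is an illustration of the definitions and not a reproduction of the evaluation code; `integral_criteria` and the toy data are hypothetical.

```python
def integral_criteria(error, dt):
    """Rectangle-rule approximations of IAE, ISE and ITAE for a sampled error."""
    iae = sum(abs(e) for e in error) * dt                       # integral of |e|
    ise = sum(e * e for e in error) * dt                        # integral of e^2
    itae = sum(k * dt * abs(e) for k, e in enumerate(error)) * dt  # integral of t*|e|
    return iae, ise, itae


# Toy error signal (hypothetical data): halves at every sample.
error = [1.0, 0.5, 0.25, 0.125]
iae, ise, itae = integral_criteria(error, dt=0.1)
print(iae, ise, itae)
```

ISE penalizes large early errors most heavily, whereas ITAE weights errors that persist late in the episode.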
+### Step Response Characteristics
+On the step reference, the agent tracks the commanded pitch angle with:
+- ✅ Minimal overshoot (0.49%)
+- ✅ Fast settling time (0.60 s)
+- ✅ Quick rise time (0.30 s)
+- ✅ Near-zero static error (-0.0046)
+- ✅ Minimal oscillations (1 cycle)
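
The characteristics listed above can be estimated from a recorded response along the following lines. This is a minimal sketch under stated assumptions: a positive step applied at the first sample, a 10-90% rise time, a ±5% settling band, and static error defined as reference minus final output. The card does not say which conventions its reported values use, and `step_metrics` and the toy response are hypothetical.

```python
def step_metrics(t, y, target, band=0.05):
    """Overshoot (%), 10-90% rise time, settling time and static error."""
    overshoot = max(0.0, (max(y) - target) / target * 100.0)
    # First crossings of 10% and 90% of the target.
    t10 = next(ti for ti, yi in zip(t, y) if yi >= 0.1 * target)
    t90 = next(ti for ti, yi in zip(t, y) if yi >= 0.9 * target)
    # Settling time: first instant after which the response stays inside the band.
    outside = [i for i, yi in enumerate(y) if abs(yi - target) > band * abs(target)]
    if not outside:
        settling = t[0]
    elif outside[-1] + 1 < len(t):
        settling = t[outside[-1] + 1]
    else:
        settling = None  # the record ends before the response settles
    static_error = target - y[-1]
    return overshoot, t90 - t10, settling, static_error


# Toy response (hypothetical data, not the model's trajectory).
t = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
y = [0.0, 0.5, 0.96, 1.02, 1.0, 1.0]
overshoot, rise_time, settling_time, static_error = step_metrics(t, y, target=1.0)
print(overshoot, rise_time, settling_time, static_error)
```

Because the overshoot here stays inside the settling band, the settling time can be shorter than the peak time, which is the same pattern as in the metrics table (0.60 s settling, 0.80 s peak).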
+## Usage
+### Installation
+```bash
+pip install tensoraerospace
+```
+### Quick Start
+```python
+import numpy as np
+import torch
+from tensoraerospace.agent.ppo.model import PPO
+from tensoraerospace.envs.b747 import ImprovedB747Env
+from tensoraerospace.signals.standart import unit_step
+from tensoraerospace.utils import generate_time_period, convert_tp_to_sec_tp
+# Load pretrained agent
+agent = PPO.from_pretrained("TensorAeroSpace/ppo-b747-pitch-control")
+# Setup environment
+dt = 0.1
+tp = generate_time_period(tn=20, dt=dt)
+tps = convert_tp_to_sec_tp(tp, dt=dt)
+# Create step reference signal (1 degree step at t=5s)
+reference = unit_step(tp=tps, degree=1.0, time_step=5.0, output_rad=True).reshape(1, -1)
+env = ImprovedB747Env(
+    initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),
+    reference_signal=reference,
+    number_time_steps=len(tp),
+    dt=dt,
+)
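+# Run one deterministic evaluation episode
+obs, _ = env.reset()
+done = False
+while not done:
+    # Use the policy mean (second return value) as the elevator command
+    action, mean_action, _ = agent.act(obs, deterministic=True)
+    action_scalar = float(np.asarray(mean_action).flatten()[0])
+    obs, reward, terminated, truncated, info = env.step(action_scalar)
+    # Stop when the episode terminates or the time limit truncates it
+    done = terminated or truncated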
+```
+### Load from Local Checkpoint
+```python
+from tensoraerospace.agent.ppo.model import PPO
+# Load from local directory
+agent = PPO.from_pretrained("./path/to/checkpoint")
+```
+## Limitations
+- **Fixed Aircraft Model**: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
+- **Step Reference Only**: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
+- **Simulation Gap**: Trained in simulation; real-world deployment would require additional validation
+- **State Observability**: Assumes all 4 longitudinal states are observable
+- **Linear Dynamics**: Based on an aircraft model linearized around trim conditions
+## Ethical Considerations
+- **Not for Real Flight Control**: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
+- **Simulation Only**: All training and evaluation were performed in simulation environments.
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+@software{tensoraerospace2024,
+  title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
+  author = {TensorAeroSpace Team},
+  year = {2024},
+  url = {https://github.com/TensorAeroSpace/TensorAeroSpace},
+  license = {MIT}
+}
+```
+## Model Card Authors
+TensorAeroSpace Team
+## Model Card Contact
+- **GitHub**: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace)
+- **Documentation**: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/)
+- **Hugging Face**: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace)