---
license: mit
language:
  - en
tags:
  - reinforcement-learning
  - pytorch
  - ppo
  - aerospace
  - flight-control
  - boeing-747
  - continuous-control
  - gymnasium
library_name: tensoraerospace
pipeline_tag: reinforcement-learning
model-index:
  - name: PPO-B747-PitchControl
    results:
      - task:
          type: reinforcement-learning
          name: Pitch Angle Tracking Control
        dataset:
          type: custom
          name: Boeing 747 Longitudinal Dynamics Simulation
        metrics:
          - type: eval_reward
            value: 0.9137
            name: Best Evaluation Reward
          - type: overshoot
            value: 0.49
            name: Overshoot (%)
          - type: settling_time
            value: 0.60
            name: Settling Time (s)
          - type: rise_time
            value: 0.30
            name: Rise Time (s)
          - type: static_error
            value: 0.0046
            name: Static Error
---

# PPO Agent for Boeing 747 Pitch Angle Control

<div align="center">

![TensorAeroSpace](https://raw.githubusercontent.com/TensorAeroSpace/TensorAeroSpace/main/img/logo-no-background.png)

**Proximal Policy Optimization (PPO) for Longitudinal Aircraft Control**

[![TensorAeroSpace](https://img.shields.io/badge/%F0%9F%9A%80-TensorAeroSpace-blue)](https://github.com/TensorAeroSpace/TensorAeroSpace)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)

</div>

## Model Description

This model is a **Proximal Policy Optimization (PPO)** agent trained to control the pitch angle (θ) of a **Boeing 747** aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.

![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/g79y7SGa8VyXCDqDjd_GO.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/OZcb5JP_txYA9WEqjHGa5.png)

### Intended Uses

- **Primary Use**: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation
- **Research Applications**: Benchmarking RL algorithms for aerospace control systems
- **Educational**: Learning reinforcement learning concepts in aerospace applications
- **Hybrid Control**: Can be combined with PID/MPC controllers for robust flight control

### Model Architecture

The PPO agent consists of separate **Actor** and **Critic** neural networks:

#### Actor Network (Policy)
| Layer | Configuration |
|-------|--------------|
| Input | 4 (observation dim) |
| Hidden 1 | Linear(4, 256) + ReLU |
| Hidden 2 | Linear(256, 256) + ReLU |
| Output (μ) | Linear(256, 1) + Tanh |
| Output (log σ) | Linear(256, 1), clamped to [-5.0, -1.5] |

#### Critic Network (Value Function)
| Layer | Configuration |
|-------|--------------|
| Input | 4 (observation dim) |
| Hidden 1 | Linear(4, 256) + ReLU |
| Hidden 2 | Linear(256, 256) + ReLU |
| Output | Linear(256, 1) |
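
The layer sizes in the two tables determine the parameter counts and the range of the policy's standard deviation. The sketch below works both out in plain Python. It assumes every `Linear` layer has a bias term (the PyTorch default) and that the clamp is applied directly to the log σ head output; the function names are illustrative and are not part of the `tensoraerospace` API.

```python
import math
from typing import Tuple


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer (bias assumed enabled)."""
    return n_in * n_out + n_out


# Shared shape of the two hidden layers: Linear(4, 256) -> Linear(256, 256)
trunk = linear_params(4, 256) + linear_params(256, 256)
actor_params = trunk + 2 * linear_params(256, 1)  # mean head + log-sigma head
critic_params = trunk + linear_params(256, 1)     # single value head

LOG_STD_MIN, LOG_STD_MAX = -5.0, -1.5  # clamp range from the actor table


def policy_head(mean_raw: float, log_std_raw: float) -> Tuple[float, float]:
    """Map raw head outputs to the (mean, std) of a Gaussian policy."""
    mean = math.tanh(mean_raw)  # keeps the mean inside [-1, 1]
    log_std = min(max(log_std_raw, LOG_STD_MIN), LOG_STD_MAX)
    return mean, math.exp(log_std)


print(actor_params, critic_params)        # 67586 67329
print(math.exp(LOG_STD_MIN), math.exp(LOG_STD_MAX))
```

Under these assumptions the clamp bounds the exploration noise to a standard deviation between roughly 0.007 and 0.22 in normalized action units.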

### State Space

The observation vector consists of 4 normalized states representing the longitudinal dynamics:

| Index | State | Description | Units |
|-------|-------|-------------|-------|
| 0 | u | Forward velocity perturbation | normalized |
| 1 | w | Vertical velocity perturbation | normalized |
| 2 | q | Pitch rate | normalized |
| 3 | θ | Pitch angle (tracking target) | normalized |

### Action Space

| Dimension | Description | Range |
|-----------|-------------|-------|
| 1 | Elevator deflection | [-1.0, 1.0] (normalized) |

The normalized action is scaled to physical elevator deflection in degrees by the environment.
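
As a rough illustration of that scaling, the sketch below clips a normalized action and multiplies it by a deflection limit. Both the symmetric linear mapping and the `max_deflection_deg` parameter are assumptions made for this example; the actual limit and mapping are defined inside the environment and are not stated in this card.

```python
def scale_action(a_norm: float, max_deflection_deg: float) -> float:
    """Clip a normalized action to [-1, 1] and scale it to degrees.

    Hypothetical helper: the real environment may use a different mapping.
    """
    clipped = max(-1.0, min(1.0, a_norm))
    return clipped * max_deflection_deg


# 25.0 is an illustrative placeholder, not the environment's actual limit.
print(scale_action(0.5, max_deflection_deg=25.0))  # 12.5
print(scale_action(1.7, max_deflection_deg=25.0))  # 25.0 (clipped)
```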

## Training Details

### Training Configuration

| Hyperparameter | Value |
|----------------|-------|
| Algorithm | PPO (Clip) |
| Max Episodes | 90,000 |
| Rollout Length | 256 steps |
| Batch Size | 16,384 |
| Epochs per Update | 2 |
| Clip Parameter (ε) | 0.15 |
| Discount Factor (γ) | 0.995 |
| GAE Lambda (λ) | 0.95 |
| Actor Learning Rate | 1e-4 |
| Critic Learning Rate | 2e-4 |
| Entropy Coefficient | 0.01 |
| Max Gradient Norm | 0.5 |
| Target KL | 0.01 |
| Normalize Observations | False |
| Normalize Rewards | True |
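
To make the roles of γ, λ and ε concrete, here is a minimal pure-Python sketch of generalized advantage estimation and the PPO clipped surrogate using the values from the table. It follows the textbook formulation and is not taken from the `tensoraerospace` source; it also assumes a single rollout with no episode boundary inside it.

```python
import math
from typing import List

GAMMA, LAM, CLIP_EPS = 0.995, 0.95, 0.15  # values from the table above


def gae(rewards: List[float], values: List[float], last_value: float) -> List[float]:
    """Generalized advantage estimates, computed backwards over one rollout."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value  # bootstrap value for the state after the rollout
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + GAMMA * next_value - values[t]  # TD residual
        running = delta + GAMMA * LAM * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(log_prob_new: float, log_prob_old: float, advantage: float) -> float:
    """Per-sample PPO-Clip objective (to be maximized)."""
    ratio = math.exp(log_prob_new - log_prob_old)
    clipped = min(max(ratio, 1.0 - CLIP_EPS), 1.0 + CLIP_EPS)
    return min(ratio * advantage, clipped * advantage)


print(gae([1.0, 1.0], [0.0, 0.0], last_value=0.0))
print(clipped_surrogate(math.log(1.5), 0.0, advantage=2.0))
```

With ε = 0.15, a probability ratio of 1.5 on a positive advantage is capped at 1.15, so the sample stops pushing the policy further in that direction.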

### Environment Configuration

| Parameter | Value |
|-----------|-------|
| Environment | `ImprovedB747VecEnvTorch` |
| Number of Parallel Envs | 64 |
| Time Step (dt) | 0.1 s |
| Episode Duration | 20 s |
| Initial State | [0, 0, 0, 0] |
| Reference Signal | Step function |
| Step Amplitude Range | 1.0° |
| Step Time Range | 5.0 s |
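
A few derived quantities follow directly from the two configuration tables. The sketch below computes them and builds an equivalent step reference by hand. It is an illustration only: the library's own time-grid helper may include the end point, and reading the batch size as "one full rollout across all environments" is an inference from the numbers, not a documented fact.

```python
import math

dt = 0.1                 # s, from the environment table
episode_duration = 20.0  # s
num_envs = 64
rollout_length = 256     # steps per environment per update

steps_per_episode = round(episode_duration / dt)    # 200
step_index = round(5.0 / dt)                        # step applied at sample 50
transitions_per_update = num_envs * rollout_length  # 16384, equal to the batch size
amplitude_rad = math.radians(1.0)                   # 1 degree expressed in radians

# Hand-built step reference: zero before t = 5 s, constant afterwards
reference = [amplitude_rad if k >= step_index else 0.0
             for k in range(steps_per_episode)]

print(steps_per_episode, step_index, transitions_per_update)
print(round(amplitude_rad, 5))
```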

### Training Infrastructure

- **Hardware**: NVIDIA GPU with CUDA support
- **Framework**: PyTorch 2.0+
- **Best Checkpoint**: Episode 7,510 (out of a maximum of 90,000)

## Evaluation Results

### Performance Metrics

| Metric | Value |
|--------|-------|
| **Best Evaluation Reward** | 0.9137 |
| **Overshoot** | 0.49% |
| **Settling Time** | 0.60 s |
| **Rise Time** | 0.30 s |
| **Peak Time** | 0.80 s |
| **Static Error** | -0.0046 |
| **Oscillation Count** | 1 |
| **Performance Index** | 3.06 |
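
The exact definitions behind these metrics are not given in this card. As a hedged sketch, the function below uses common control-engineering conventions: rise time from 10% to 90% of the command, a ±5% settling band, and times measured from the step onset. The function name, the band width and the sign convention for static error are assumptions, so its output need not match the library's evaluation code.

```python
from typing import Dict, List


def step_metrics(y: List[float], target: float, dt: float,
                 band: float = 0.05) -> Dict[str, float]:
    """Step-response metrics for samples y that start at the step onset.

    Assumes a nonzero target and a response that reaches 90% of it.
    """
    frac = [v / target for v in y]  # response as a fraction of the command
    peak = max(frac)
    k10 = next(k for k, f in enumerate(frac) if f >= 0.1)
    k90 = next(k for k, f in enumerate(frac) if f >= 0.9)
    settle = 0
    for k, f in enumerate(frac):
        if abs(f - 1.0) > band:  # last sample outside the band decides settling
            settle = k + 1
    return {
        "overshoot_pct": max(0.0, (peak - 1.0) * 100.0),
        "rise_time": (k90 - k10) * dt,
        "peak_time": frac.index(peak) * dt,
        "settling_time": settle * dt,
        "static_error": target - y[-1],
    }


# Synthetic response, not data from the trained agent
print(step_metrics([0.0, 0.5, 0.96, 1.02, 1.0, 1.0], target=1.0, dt=0.1))
```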

### Integral Criteria

| Criterion | Value |
|-----------|-------|
| IAE (Integral Absolute Error) | 4.08 |
| ISE (Integral Squared Error) | 2.64 |
| ITAE (Integral Time-weighted Absolute Error) | 4.77 |
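
For reference, the three criteria can be approximated from a sampled tracking error with a simple rectangle rule, as in the sketch below. The units and integration scheme used to produce the values in the table are not stated here, so this shows the definitions and is not a reproduction of the reported numbers.

```python
from typing import Dict, List


def integral_criteria(error: List[float], dt: float) -> Dict[str, float]:
    """Rectangle-rule approximations of IAE, ISE and ITAE."""
    iae = sum(abs(e) for e in error) * dt
    ise = sum(e * e for e in error) * dt
    itae = sum(k * dt * abs(e) for k, e in enumerate(error)) * dt  # weight by time
    return {"IAE": iae, "ISE": ise, "ITAE": itae}


# Synthetic error samples, not data from the trained agent
print(integral_criteria([1.0, 0.5, 0.0], dt=0.1))
```

ITAE weights late errors more heavily than early ones, so it penalizes slow settling and steady-state offset more than the initial transient.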

### Step Response Characteristics

The agent tracks the step reference with:
- ✅ Minimal overshoot (0.49%)
- ✅ Fast settling time (0.60 s)
- ✅ Quick rise time (0.30 s)
- ✅ Near-zero static error (magnitude 0.0046)
- ✅ Minimal oscillations (1 cycle)

## Usage

### Installation

```bash
pip install tensoraerospace
```

### Quick Start

```python
import numpy as np
import torch
from tensoraerospace.agent.ppo.model import PPO
from tensoraerospace.envs.b747 import ImprovedB747Env
from tensoraerospace.signals.standart import unit_step
from tensoraerospace.utils import generate_time_period, convert_tp_to_sec_tp

# Load pretrained agent
agent = PPO.from_pretrained("TensorAeroSpace/ppo-b747-pitch-control")

# Setup environment
dt = 0.1
tp = generate_time_period(tn=20, dt=dt)
tps = convert_tp_to_sec_tp(tp, dt=dt)

# Create step reference signal (1 degree step at t=5s)
reference = unit_step(tp=tps, degree=1.0, time_step=5.0, output_rad=True).reshape(1, -1)

env = ImprovedB747Env(
    initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),
    reference_signal=reference,
    number_time_steps=len(tp),
    dt=dt,
)

# Run evaluation
obs, _ = env.reset()
done = False

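while not done:
    # Query the policy without exploration noise
    action, mean_action, _ = agent.act(obs, deterministic=True)
    # The environment expects a scalar elevator command
    action_scalar = float(np.asarray(mean_action).flatten()[0])
    obs, reward, terminated, truncated, info = env.step(action_scalar)
    # Stop when the episode terminates or hits the time limit
    done = terminated or truncated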
```

### Load from Local Checkpoint

```python
from tensoraerospace.agent.ppo.model import PPO

# Load from local directory
agent = PPO.from_pretrained("./path/to/checkpoint")
```

## Limitations

- **Fixed Aircraft Model**: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
- **Step Reference Only**: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
- **Simulation Gap**: Trained in simulation; real-world deployment would require additional validation
- **State Observability**: Assumes all 4 longitudinal states are observable
- **Linear Dynamics**: Based on an aircraft model linearized around trim conditions, so large deviations from trim are not represented

## Ethical Considerations

- **Not for Real Flight Control**: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
- **Simulation Only**: All training and evaluation were performed in simulation environments.

## Citation

If you use this model in your research, please cite:

```bibtex
@software{tensoraerospace2024,
  title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
  author = {TensorAeroSpace Team},
  year = {2024},
  url = {https://github.com/TensorAeroSpace/TensorAeroSpace},
  license = {MIT}
}
```

## Model Card Authors

TensorAeroSpace Team

## Model Card Contact

- **GitHub**: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace)
- **Documentation**: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/)
- **Hugging Face**: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace)