	---
	license: mit
	language:
	- en
	tags:
	- reinforcement-learning
	- pytorch
	- ppo
	- aerospace
	- flight-control
	- boeing-747
	- continuous-control
	- gymnasium
	library_name: tensoraerospace
	pipeline_tag: reinforcement-learning
	model-index:
	- name: PPO-B747-PitchControl
	  results:
	  - task:
	      type: reinforcement-learning
	      name: Pitch Angle Tracking Control
	    dataset:
	      type: custom
	      name: Boeing 747 Longitudinal Dynamics Simulation
	    metrics:
	    - type: eval_reward
	      value: 0.9137
	      name: Best Evaluation Reward
	    - type: overshoot
	      value: 0.49
	      name: Overshoot (%)
	    - type: settling_time
	      value: 0.60
	      name: Settling Time (s)
	    - type: rise_time
	      value: 0.30
	      name: Rise Time (s)
	    - type: static_error
	      value: 0.0046
	      name: Static Error
	---

	# PPO Agent for Boeing 747 Pitch Angle Control

	<div align="center">

	![TensorAeroSpace](https://raw.githubusercontent.com/TensorAeroSpace/TensorAeroSpace/main/img/logo-no-background.png)

	Proximal Policy Optimization (PPO) for Longitudinal Aircraft Control

	[![TensorAeroSpace](https://img.shields.io/badge/%F0%9F%9A%80-TensorAeroSpace-blue)](https://github.com/TensorAeroSpace/TensorAeroSpace)
	[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
	[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)

	</div>

	## Model Description

	This model is a Proximal Policy Optimization (PPO) agent trained to control the pitch angle (θ) of a Boeing 747 aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.

	![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/g79y7SGa8VyXCDqDjd_GO.png)

	![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/OZcb5JP_txYA9WEqjHGa5.png)

	### Intended Uses

	- Primary Use: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation
	- Research Applications: Benchmarking RL algorithms for aerospace control systems
	- Educational: Learning reinforcement learning concepts in aerospace applications
	- Hybrid Control: Can be combined with PID/MPC controllers for robust flight control

	### Model Architecture

	The PPO agent consists of separate Actor and Critic neural networks:

	#### Actor Network (Policy)
	| Layer | Configuration |
	|-------|--------------|
	| Input | 4 (observation dim) |
	| Hidden 1 | Linear(4, 256) + ReLU |
	| Hidden 2 | Linear(256, 256) + ReLU |
	| Output (μ) | Linear(256, 1) + Tanh |
	| Output (log σ) | Linear(256, 1), clamped to [-5.0, -1.5] |

	#### Critic Network (Value Function)
	| Layer | Configuration |
	|-------|--------------|
	| Input | 4 (observation dim) |
	| Hidden 1 | Linear(4, 256) + ReLU |
	| Hidden 2 | Linear(256, 256) + ReLU |
	| Output | Linear(256, 1) |
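The two tables above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the library's implementation: the class names `Actor` and `Critic` are hypothetical, and the assumption that the μ and log σ heads share one hidden trunk is inferred from both heads taking a 256-dimensional input.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Gaussian policy: 2x256 ReLU trunk with a tanh mean head and a clamped log-std head."""

    def __init__(self, obs_dim=4, act_dim=1, hidden=256,
                 log_std_min=-5.0, log_std_max=-1.5):
        super().__init__()
        # Assumption: both output heads share this trunk.
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu_head = nn.Linear(hidden, act_dim)
        self.log_std_head = nn.Linear(hidden, act_dim)
        self.log_std_min = log_std_min
        self.log_std_max = log_std_max

    def forward(self, obs):
        h = self.trunk(obs)
        mu = torch.tanh(self.mu_head(h))  # mean action in [-1, 1]
        log_std = torch.clamp(self.log_std_head(h),
                              self.log_std_min, self.log_std_max)
        return mu, log_std


class Critic(nn.Module):
    """State-value network: 2x256 ReLU layers and a scalar output."""

    def __init__(self, obs_dim=4, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs):
        return self.net(obs)
```

With the clamp range [-5.0, -1.5], the policy standard deviation stays between roughly exp(-5) ≈ 0.0067 and exp(-1.5) ≈ 0.22 in normalized action units, which bounds exploration noise on the elevator command.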

	### State Space

	The observation vector consists of 4 normalized states representing the longitudinal dynamics:

	| Index | State | Description | Units |
	|-------|-------|-------------|-------|
	| 0 | u | Forward velocity perturbation | normalized |
	| 1 | w | Vertical velocity perturbation | normalized |
	| 2 | q | Pitch rate | normalized |
	| 3 | θ | Pitch angle (tracking target) | normalized |

	### Action Space

	| Dimension | Description | Range |
	|-----------|-------------|-------|
	| 1 | Elevator deflection | [-1.0, 1.0] (normalized) |

	The normalized action is scaled to physical elevator deflection in degrees by the environment.
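As a rough illustration of that scaling step, the sketch below maps a normalized action to degrees with a simple linear gain. The function name `scale_action` and the linear mapping are assumptions; the actual deflection limit and scaling law are defined inside the environment and are not stated in this card.

```python
def scale_action(a_norm: float, max_deflection_deg: float) -> float:
    """Map a normalized action in [-1, 1] to an elevator deflection in degrees.

    Assumes a symmetric, linear mapping; the environment may differ.
    """
    a_clipped = max(-1.0, min(1.0, a_norm))  # guard against out-of-range actions
    return a_clipped * max_deflection_deg


# Placeholder limit for illustration only, not the environment's real value.
PLACEHOLDER_LIMIT_DEG = 20.0
half_deflection = scale_action(0.5, PLACEHOLDER_LIMIT_DEG)
```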

	## Training Details

	### Training Configuration

	| Hyperparameter | Value |
	|----------------|-------|
	| Algorithm | PPO (Clip) |
	| Max Episodes | 90,000 |
	| Rollout Length | 256 steps |
	| Batch Size | 16,384 |
	| Epochs per Update | 2 |
	| Clip Parameter (ε) | 0.15 |
	| Discount Factor (γ) | 0.995 |
	| GAE Lambda (λ) | 0.95 |
	| Actor Learning Rate | 1e-4 |
	| Critic Learning Rate | 2e-4 |
	| Entropy Coefficient | 0.01 |
	| Max Gradient Norm | 0.5 |
	| Target KL | 0.01 |
	| Normalize Observations | False |
	| Normalize Rewards | True |
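To show how γ, λ, and ε from the table enter the update, here is a dependency-free sketch of Generalized Advantage Estimation and the per-sample PPO clipped objective. The function names are hypothetical and the code follows the standard textbook formulas; it is not taken from the TensorAeroSpace training loop, and it omits episode-termination masking for brevity.

```python
import math

GAMMA, LAM, CLIP_EPS = 0.995, 0.95, 0.15  # values from the table above


def gae_advantages(rewards, values, last_value, gamma=GAMMA, lam=LAM):
    """Generalized Advantage Estimation over one rollout without terminations."""
    advantages = [0.0] * len(rewards)
    next_value, running = last_value, 0.0
    for t in reversed(range(len(rewards))):
        # TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future residuals
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(new_log_prob, old_log_prob, advantage, eps=CLIP_EPS):
    """Per-sample PPO-Clip objective (to be maximized)."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

Note that 256 rollout steps × 64 parallel environments gives 16,384 transitions, which matches the batch size in the table; this suggests, though the card does not state it, that each update consumes one full rollout as a single batch.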

	### Environment Configuration

	| Parameter | Value |
	|-----------|-------|
	| Environment | `ImprovedB747VecEnvTorch` |
	| Number of Parallel Envs | 64 |
	| Time Step (dt) | 0.1 s |
	| Episode Duration | 20 s |
	| Initial State | [0, 0, 0, 0] |
	| Reference Signal | Step function |
	| Step Amplitude Range | 1.0° |
	| Step Time Range | 5.0 s |
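For orientation, the sketch below builds the step reference implied by these settings in plain Python: a 20 s episode at dt = 0.1 s with a 1° step at t = 5 s, expressed in radians. The constant names are hypothetical, and whether the library's own time grid includes the end point (200 versus 201 samples) is an assumption not confirmed here.

```python
import math

DT = 0.1                  # time step, s
EPISODE_S = 20.0          # episode duration, s
STEP_TIME_S = 5.0         # step onset, s
STEP_AMPLITUDE_DEG = 1.0  # step amplitude, degrees

# Assumption: the time grid excludes the end point, giving 200 samples.
n_steps = round(EPISODE_S / DT)
step_index = round(STEP_TIME_S / DT)
amplitude_rad = math.radians(STEP_AMPLITUDE_DEG)

# Zero before the step onset, constant amplitude afterwards.
reference = [amplitude_rad if k >= step_index else 0.0 for k in range(n_steps)]
```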

	### Training Infrastructure

	- Hardware: NVIDIA GPU with CUDA support
	- Framework: PyTorch 2.0+
	- Best Checkpoint: reached at episode 7,510 (of a 90,000-episode maximum)

	## Evaluation Results

	### Performance Metrics

	| Metric | Value |
	|--------|-------|
	| Best Evaluation Reward | 0.9137 |
	| Overshoot | 0.49% |
	| Settling Time | 0.60 s |
	| Rise Time | 0.30 s |
	| Peak Time | 0.80 s |
	| Static Error | -0.0046 |
	| Oscillation Count | 1 |
	| Performance Index | 3.06 |

	### Integral Criteria

	| Criterion | Value |
	|-----------|-------|
	| IAE (Integral Absolute Error) | 4.08 |
	| ISE (Integral Squared Error) | 2.64 |
	| ITAE (Integral Time-weighted Absolute Error) | 4.77 |
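The three criteria follow the standard definitions IAE = ∫|e| dt, ISE = ∫e² dt, and ITAE = ∫t·|e| dt. A minimal sketch using a rectangle-rule approximation is shown below; the function name is hypothetical, and the units, time origin, and normalization behind the values in the table are not stated in this card, so this code is not expected to reproduce them exactly.

```python
def integral_criteria(errors, dt):
    """Approximate IAE, ISE, and ITAE from a sampled tracking error.

    errors: sequence of e[k] = reference[k] - response[k]
    dt:     sampling period in seconds
    """
    iae = sum(abs(e) for e in errors) * dt
    ise = sum(e * e for e in errors) * dt
    # Time is measured from the first sample (t = k * dt).
    itae = sum(k * dt * abs(e) for k, e in enumerate(errors)) * dt
    return iae, ise, itae
```

ITAE weights late errors more heavily than early ones, so it penalizes slow settling and steady-state offset more than the initial transient.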

	### Step Response Characteristics

	On the evaluated step reference, the agent tracks the command with:
	- ✅ Minimal overshoot (0.49%)
	- ✅ Fast settling time (0.60 s)
	- ✅ Quick rise time (0.30 s)
	- ✅ Near-zero static error (-0.0046)
	- ✅ Minimal oscillations (1 cycle)
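Step-response characteristics like these can be computed from a logged response roughly as sketched below. The definitions used here (10% to 90% rise time, a ±5% settling band, static error as target minus final value) are common conventions chosen for illustration; the card does not state which definitions TensorAeroSpace uses, and the function name is hypothetical.

```python
def step_metrics(response, target, dt, band=0.05):
    """Basic step-response metrics for a positive step of size `target`.

    response: samples of the output, starting at the step onset
    band:     settling band as a fraction of the target (assumed 5%)
    """
    peak = max(response)
    overshoot_pct = max(0.0, (peak - target) / target * 100.0)
    peak_time = response.index(peak) * dt

    # Rise time: first crossing of 10% to first crossing of 90% of the target.
    t10 = next(k for k, y in enumerate(response) if y >= 0.1 * target) * dt
    t90 = next(k for k, y in enumerate(response) if y >= 0.9 * target) * dt

    # Settling time: first instant after which the response stays in the band.
    settle_index = 0
    for k, y in enumerate(response):
        if abs(y - target) > band * target:
            settle_index = k + 1

    return {
        "overshoot_pct": overshoot_pct,
        "peak_time": peak_time,
        "rise_time": t90 - t10,
        "settling_time": settle_index * dt,
        "static_error": target - response[-1],
    }
```

Because the environment runs at dt = 0.1 s, any metric computed this way is quantized to 0.1 s, which is consistent with the reported rise, settling, and peak times all being multiples of 0.1 s.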

	## Usage

	### Installation

	```bash
	pip install tensoraerospace
	```

	### Quick Start

	```python
	import numpy as np
	import torch
	from tensoraerospace.agent.ppo.model import PPO
	from tensoraerospace.envs.b747 import ImprovedB747Env
	from tensoraerospace.signals.standart import unit_step
	from tensoraerospace.utils import generate_time_period, convert_tp_to_sec_tp

	# Load pretrained agent
	agent = PPO.from_pretrained("TensorAeroSpace/ppo-b747-pitch-control")

	# Setup environment
	dt = 0.1
	tp = generate_time_period(tn=20, dt=dt)
	tps = convert_tp_to_sec_tp(tp, dt=dt)

	# Create step reference signal (1 degree step at t=5s)
	reference = unit_step(tp=tps, degree=1.0, time_step=5.0, output_rad=True).reshape(1, -1)

	env = ImprovedB747Env(
	    initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),
	    reference_signal=reference,
	    number_time_steps=len(tp),
	    dt=dt,
	)

	# Run evaluation
	obs, _ = env.reset()
	done = False

	while not done:
	    # Use the deterministic (mean) action for evaluation
	    action, mean_action, _ = agent.act(obs, deterministic=True)
	    action_scalar = float(np.asarray(mean_action).flatten()[0])
	    obs, reward, terminated, truncated, info = env.step(action_scalar)
	    done = terminated or truncated
	```

	### Load from Local Checkpoint

	```python
	from tensoraerospace.agent.ppo.model import PPO

	# Load from local directory
	agent = PPO.from_pretrained("./path/to/checkpoint")
	```

	## Limitations

	- Fixed Aircraft Model: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
	- Step Reference Only: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
	- Simulation Gap: Trained in simulation; real-world deployment would require additional validation
	- State Observability: Assumes all 4 longitudinal states are observable
	- Linear Dynamics: Based on linearized aircraft model around trim conditions

	## Ethical Considerations

	- Not for Real Flight Control: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
	- Simulation Only: All training and evaluation performed in simulation environments.

	## Citation

	If you use this model in your research, please cite:

	```bibtex
	@software{tensoraerospace2024,
	  title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
	  author = {TensorAeroSpace Team},
	  year = {2024},
	  url = {https://github.com/TensorAeroSpace/TensorAeroSpace},
	  license = {MIT}
	}
	```

	## Model Card Authors

	TensorAeroSpace Team

	## Model Card Contact

	- GitHub: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace)
	- Documentation: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/)
	- Hugging Face: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace)