---
license: mit
language:
- en
tags:
- reinforcement-learning
- pytorch
- dsac
- aerospace
- flight-control
- boeing-747
- continuous-control
- gymnasium
library_name: tensoraerospace
pipeline_tag: reinforcement-learning
model-index:
- name: DSAC-B747-PitchControl
results:
- task:
type: reinforcement-learning
name: Pitch Angle Tracking Control
dataset:
type: custom
name: Boeing 747 Longitudinal Dynamics Simulation
metrics:
- type: overshoot
value: 0.99
name: Overshoot (%)
- type: settling_time
value: 0.40
name: Settling Time (s)
- type: rise_time
value: 0.40
name: Rise Time (s)
- type: peak_time
value: 10.50
name: Peak Time (s)
- type: static_error
value: 0.0002
name: Static Error
- type: performance_index
value: 0.269
name: Performance Index
---
# DSAC Agent for Boeing 747 Pitch Angle Control
<div align="center">
![TensorAeroSpace](https://raw.githubusercontent.com/TensorAeroSpace/TensorAeroSpace/main/img/logo-no-background.png)
**Distributional Soft Actor-Critic (DSAC) for Longitudinal Aircraft Control**
[![TensorAeroSpace](https://img.shields.io/badge/%F0%9F%9A%80-TensorAeroSpace-blue)](https://github.com/TensorAeroSpace/TensorAeroSpace)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
</div>
## Model Description
This model is a **Distributional Soft Actor-Critic (DSAC)** agent trained to control the pitch angle (θ) of a **Boeing 747** aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.
![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/7QBAzPCcRlo6KNcTe58J2.png)
![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/6udlxWC573gza2EdoXEhv.png)
### Intended Uses
- **Primary Use**: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation
- **Research Applications**: Benchmarking RL algorithms for aerospace control systems
- **Educational**: Learning reinforcement learning concepts in aerospace applications
- **Hybrid Control**: Can be combined with PID/MPC controllers for robust flight control
### Model Architecture
The DSAC agent consists of an **Actor** (policy) network and **twin Critic** networks with distributional value estimation:
#### Actor Network (Policy)
| Layer | Configuration |
|-------|--------------|
| Input | 5 (observation dim with reference) |
| Hidden 1 | Linear(5, 256) + ReLU |
| Hidden 2 | Linear(256, 256) + ReLU |
| Output (μ) | Linear(256, 1) + Tanh |
| Output (log σ) | Linear(256, 1), clamped |
#### Twin Critic Networks (Distributional Q-Function)
| Layer | Configuration |
|-------|--------------|
| Input | 5 (obs) + 1 (action) |
| Hidden 1 | Linear(6, 256) + ReLU |
| Hidden 2 | Linear(256, 256) + ReLU |
| Output | Linear(256, 1) |
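The two tables above can be sketched in PyTorch as follows. This is an illustrative reconstruction from the layer tables, not the library's actual classes; in particular, the log σ clamp range (−20, 2) is a common convention and is assumed here, since the card only says "clamped".

```python
import torch
import torch.nn as nn

# Assumed clamp range for log sigma; the card does not specify the bounds.
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0


class Actor(nn.Module):
    """Gaussian policy matching the table: 5 -> 256 -> 256 -> (mu, log sigma)."""

    def __init__(self, obs_dim: int = 5, act_dim: int = 1, hidden: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu_head = nn.Linear(hidden, act_dim)
        self.log_std_head = nn.Linear(hidden, act_dim)

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        mu = torch.tanh(self.mu_head(h))  # mean squashed to [-1, 1]
        log_std = self.log_std_head(h).clamp(LOG_STD_MIN, LOG_STD_MAX)
        return mu, log_std


class Critic(nn.Module):
    """One of the twin Q-networks: (obs, action) -> 256 -> 256 -> scalar."""

    def __init__(self, obs_dim: int = 5, act_dim: int = 1, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, action], dim=-1))
```

DSAC instantiates two independent `Critic` copies and takes a pessimistic estimate across them to reduce overestimation bias.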
### State Space
The observation vector consists of 5 normalized states representing the longitudinal dynamics (with `include_reference_in_obs=True`):
| Index | State | Description | Units |
|-------|-------|-------------|-------|
| 0 | u | Forward velocity perturbation | normalized |
| 1 | w | Vertical velocity perturbation | normalized |
| 2 | q | Pitch rate | normalized |
| 3 | θ | Pitch angle (tracking target) | normalized |
| 4 | θ_ref | Reference pitch angle | normalized |
### Action Space
| Dimension | Description | Range |
|-----------|-------------|-------|
| 1 | Elevator deflection | [-1.0, 1.0] (normalized) |
The normalized action is scaled to physical elevator deflection in degrees by the environment.
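The scaling is a simple clip-and-multiply. The sketch below illustrates the idea; the ±10° elevator limit is a hypothetical placeholder, since the card does not state the physical deflection range used by `ImprovedB747Env`.

```python
import numpy as np

# Hypothetical physical limit; the actual range is defined by the environment.
ELEVATOR_LIMIT_DEG = 10.0


def denormalize_action(a_norm: float, limit_deg: float = ELEVATOR_LIMIT_DEG) -> float:
    """Map a normalized action in [-1, 1] to an elevator deflection in degrees."""
    a_clipped = float(np.clip(a_norm, -1.0, 1.0))
    return a_clipped * limit_deg


print(denormalize_action(0.5))   # 5.0
print(denormalize_action(-1.2))  # clipped to -1.0, giving -10.0
```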
## Training Details
### Environment Configuration
| Parameter | Value |
|-----------|-------|
| Environment | `ImprovedB747Env` |
| Time Step (dt) | 0.1 s |
| Episode Duration | 20 s |
| Initial State | [0, 0, 0, 0] |
| Reference Signal | Step function |
| Step Amplitude | 1.0° |
| Step Time | 5.0 s |
| Reward Mode | `step_response` |
| Include Reference in Obs | True |
### Training Infrastructure
- **Hardware**: CPU / GPU / MPS (auto-select)
- **Framework**: PyTorch 2.0+
## Evaluation Results
### Performance Metrics
| Metric | Value |
|--------|-------|
| **Overshoot** | 0.99% |
| **Settling Time (±5%)** | 0.40 s |
| **Rise Time** | 0.40 s |
| **Peak Time** | 10.50 s |
| **Static Error** | -0.0002 |
| **Oscillation Count** | 0 |
| **Performance Index** | 0.269 |
### Integral Criteria
| Criterion | Value |
|-----------|-------|
| IAE (Integral Absolute Error) | 0.05 |
| ISE (Integral Squared Error) | 0.00 |
| ITAE (Integral Time-weighted Absolute Error) | 0.18 |
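These criteria integrate the tracking error e(t) = θ_ref(t) − θ(t) over the episode: IAE = ∫|e| dt, ISE = ∫e² dt, ITAE = ∫t·|e| dt. A minimal sketch of how they can be computed from a sampled response (this is illustrative, not the library's evaluation code):

```python
import numpy as np


def _trapz(y: np.ndarray, t: np.ndarray) -> float:
    """Trapezoidal-rule integral of y over t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


def integral_criteria(t: np.ndarray, y: np.ndarray, ref: np.ndarray) -> dict:
    """Compute IAE, ISE, and ITAE for the tracking error ref - y."""
    e = ref - y
    return {
        "IAE": _trapz(np.abs(e), t),
        "ISE": _trapz(e**2, t),
        "ITAE": _trapz(t * np.abs(e), t),
    }


# Toy example: first-order response to a unit step
t = np.linspace(0.0, 10.0, 1001)
ref = np.ones_like(t)
y = 1.0 - np.exp(-2.0 * t)
print(integral_criteria(t, y, ref))
```

For this toy response the analytic values are IAE = 0.5, ISE = 0.25, ITAE = 0.25, which the numerical integration reproduces closely.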
### Step Response Characteristics
The agent demonstrates excellent step tracking performance with:
- ✅ Minimal overshoot (~1%)
- ✅ Fast settling time (0.4s)
- ✅ Quick rise time (0.4s)
- ✅ Near-zero static error
- ✅ No oscillations
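The table's overshoot and settling-time figures can be recovered from a sampled step response. The sketch below shows one common definition (settling time as the time of the last sample outside the ±5% band); it is illustrative and may differ in detail from how TensorAeroSpace computes these metrics.

```python
import numpy as np


def step_metrics(t: np.ndarray, y: np.ndarray, target: float = 1.0, band: float = 0.05):
    """Return (overshoot %, settling time s) for a step response sampled at times t."""
    overshoot = max(0.0, (float(y.max()) - target) / target * 100.0)
    outside = np.abs(y - target) > band * abs(target)
    # Settling time: time of the last sample still outside the tolerance band
    settling = float(t[np.nonzero(outside)[0][-1]]) if outside.any() else 0.0
    return overshoot, settling


# Toy first-order response to a unit step (no overshoot)
t = np.linspace(0.0, 10.0, 1001)
y = 1.0 - np.exp(-2.0 * t)
print(step_metrics(t, y))
```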
## Usage
### Installation
```bash
pip install tensoraerospace
```
### Quick Start
```python
import numpy as np
import torch

from tensoraerospace.agent import DSAC
from tensoraerospace.envs.b747 import ImprovedB747Env
from tensoraerospace.signals.standart import unit_step


def pick_device() -> str:
    """Select the best available torch device."""
    if torch.cuda.is_available():
        return "cuda"
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return "mps"
    return "cpu"


# Simulation timeline
dt = 0.1             # integration step, s
tn = 20.0            # episode duration, s
step_deg = 1.0       # step amplitude, deg
step_time_sec = 5.0  # step onset, s
t = np.arange(0.0, tn + dt, dt, dtype=np.float32)

# Create the step reference signal (1 degree step at t = 5 s)
ref = unit_step(t, degree=step_deg, time_step=step_time_sec, output_rad=True).reshape(1, -1)

env = ImprovedB747Env(
    initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=float),
    reference_signal=ref,
    number_time_steps=ref.shape[1],
    dt=dt,
    include_reference_in_obs=True,
    reward_mode="step_response",
)

# Load the pretrained agent
agent = DSAC.from_pretrained("TensorAeroSpace/dsac-b747-step-response")
agent.env = env
agent.to_device(pick_device())
agent.eval()

# Run one evaluation episode
obs, _ = env.reset()
done = False
total_reward = 0.0
while not done:
    action = agent.select_action(obs, evaluate=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = bool(terminated or truncated)
    total_reward += float(reward)

print(f"Episode reward: {total_reward}")
```
### Load from Local Checkpoint
```python
from tensoraerospace.agent import DSAC
# Load from local directory
agent = DSAC.from_pretrained("./path/to/checkpoint")
```
## Limitations
- **Fixed Aircraft Model**: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
- **Step Reference Only**: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
- **Simulation Gap**: Trained in simulation; real-world deployment would require additional validation
- **State Observability**: Assumes all longitudinal states are observable
- **Linear Dynamics**: Based on linearized aircraft model around trim conditions
## Ethical Considerations
- **Not for Real Flight Control**: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
- **Simulation Only**: All training and evaluation performed in simulation environments.
## Citation
If you use this model in your research, please cite:
```bibtex
@software{tensoraerospace2024,
title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
author = {TensorAeroSpace Team},
year = {2024},
url = {https://github.com/TensorAeroSpace/TensorAeroSpace},
license = {MIT}
}
```
## Model Card Authors
TensorAeroSpace Team
## Model Card Contact
- **GitHub**: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace)
- **Documentation**: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/)
- **Hugging Face**: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace)