---
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
tags: |
|
|
- reinforcement-learning |
|
|
- pytorch |
|
|
- dsac |
|
|
- aerospace |
|
|
- flight-control |
|
|
- boeing-747 |
|
|
- continuous-control |
|
|
- gymnasium |
|
|
library_name: tensoraerospace |
|
|
pipeline_tag: reinforcement-learning |
|
|
model-index: |
|
|
- name: DSAC-B747-PitchControl |
|
|
results: |
|
|
- task: |
|
|
type: reinforcement-learning |
|
|
name: Pitch Angle Tracking Control |
|
|
dataset: |
|
|
type: custom |
|
|
name: Boeing 747 Longitudinal Dynamics Simulation |
|
|
metrics: |
|
|
- type: overshoot |
|
|
value: 0.99 |
|
|
name: Overshoot (%) |
|
|
- type: settling_time |
|
|
value: 0.40 |
|
|
name: Settling Time (s) |
|
|
- type: rise_time |
|
|
value: 0.40 |
|
|
name: Rise Time (s) |
|
|
- type: peak_time |
|
|
value: 10.50 |
|
|
name: Peak Time (s) |
|
|
- type: static_error |
|
|
value: -0.0002
|
|
name: Static Error |
|
|
- type: performance_index |
|
|
value: 0.269 |
|
|
name: Performance Index |
|
|
--- |
|
|
|
|
|
# DSAC Agent for Boeing 747 Pitch Angle Control |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
 |
|
|
|
|
|
**Distributional Soft Actor-Critic (DSAC) for Longitudinal Aircraft Control** |
|
|
|
|
|
[](https://github.com/TensorAeroSpace/TensorAeroSpace) |
|
|
[](https://opensource.org/licenses/MIT) |
|
|
[](https://pytorch.org/) |
|
|
|
|
|
</div> |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This model is a **Distributional Soft Actor-Critic (DSAC)** agent trained to control the pitch angle (θ) of a **Boeing 747** aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals. |
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
 |
|
|
|
|
|
### Intended Uses |
|
|
|
|
|
- **Primary Use**: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation |
|
|
- **Research Applications**: Benchmarking RL algorithms for aerospace control systems |
|
|
- **Educational**: Learning reinforcement learning concepts in aerospace applications |
|
|
- **Hybrid Control**: Can be combined with PID/MPC controllers for robust flight control |
|
|
|
|
|
### Model Architecture |
|
|
|
|
|
The DSAC agent consists of an **Actor** (policy) network and twin **Critic** networks with distributional value estimation:
|
|
|
|
|
#### Actor Network (Policy) |
|
|
| Layer | Configuration | |
|
|
|-------|--------------| |
|
|
| Input | 5 (observation dim with reference) | |
|
|
| Hidden 1 | Linear(5, 256) + ReLU | |
|
|
| Hidden 2 | Linear(256, 256) + ReLU | |
|
|
| Output (μ) | Linear(256, 1) + Tanh | |
|
|
| Output (log σ) | Linear(256, 1), clamped | |
|
|
|
|
|
#### Twin Critic Networks (Distributional Q-Function) |
|
|
| Layer | Configuration | |
|
|
|-------|--------------| |
|
|
| Input | 5 (obs) + 1 (action) | |
|
|
| Hidden 1 | Linear(6, 256) + ReLU | |
|
|
| Hidden 2 | Linear(256, 256) + ReLU | |
|
|
| Output | Linear(256, 1) | |
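The tables above can be sketched in PyTorch as follows. This is an illustrative re-implementation, not the library's actual classes: the log-σ clamp bounds are assumed (typical SAC-style values), and the distributional critic head is simplified to the scalar output listed in the table.

```python
import torch
import torch.nn as nn

LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0  # assumed clamp bounds, common in SAC variants

class Actor(nn.Module):
    """Gaussian policy: 5-dim observation -> mean (tanh-squashed) and clamped log-std."""
    def __init__(self, obs_dim=5, act_dim=1, hidden=256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        h = self.body(obs)
        mu = torch.tanh(self.mu(h))  # mean action in [-1, 1]
        log_std = self.log_std(h).clamp(LOG_STD_MIN, LOG_STD_MAX)
        return mu, log_std

class Critic(nn.Module):
    """Q-network: concatenated (observation, action) -> scalar value estimate."""
    def __init__(self, obs_dim=5, act_dim=1, hidden=256):
        super().__init__()
        self.q = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.q(torch.cat([obs, act], dim=-1))

obs = torch.zeros(1, 5)
mu, log_std = Actor()(obs)
q = Critic()(obs, mu)
```

In DSAC the two critics are instantiated independently and the smaller of the two estimates is used for the target, which is the standard twin-critic trick to curb overestimation.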
|
|
|
|
|
### State Space |
|
|
|
|
|
The observation vector consists of 5 normalized states representing the longitudinal dynamics (with `include_reference_in_obs=True`): |
|
|
|
|
|
| Index | State | Description | Units | |
|
|
|-------|-------|-------------|-------| |
|
|
| 0 | u | Forward velocity perturbation | normalized | |
|
|
| 1 | w | Vertical velocity perturbation | normalized | |
|
|
| 2 | q | Pitch rate | normalized | |
|
|
| 3 | θ | Pitch angle (tracking target) | normalized | |
|
|
| 4 | θ_ref | Reference pitch angle | normalized | |
|
|
|
|
|
### Action Space |
|
|
|
|
|
| Dimension | Description | Range | |
|
|
|-----------|-------------|-------| |
|
|
| 1 | Elevator deflection | [-1.0, 1.0] (normalized) | |
|
|
|
|
|
The normalized action is scaled to physical elevator deflection in degrees by the environment. |
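A toy helper illustrates this mapping; `MAX_ELEVATOR_DEG` is a hypothetical limit, since the actual bound and clipping live inside the environment.

```python
import numpy as np

MAX_ELEVATOR_DEG = 25.0  # illustrative only; the real limit is defined by the environment

def to_physical(action_norm: float) -> float:
    """Map a normalized action in [-1, 1] to an elevator deflection in degrees."""
    return float(np.clip(action_norm, -1.0, 1.0)) * MAX_ELEVATOR_DEG

print(to_physical(0.5))   # 12.5
print(to_physical(-1.2))  # clipped to -25.0
```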
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Environment Configuration |
|
|
|
|
|
| Parameter | Value | |
|
|
|-----------|-------| |
|
|
| Environment | `ImprovedB747Env` | |
|
|
| Time Step (dt) | 0.1 s | |
|
|
| Episode Duration | 20 s | |
|
|
| Initial State | [0, 0, 0, 0] | |
|
|
| Reference Signal | Step function | |
|
|
| Step Amplitude | 1.0° | |
|
|
| Step Time | 5.0 s | |
|
|
| Reward Mode | `step_response` | |
|
|
| Include Reference in Obs | True | |
|
|
|
|
|
### Training Infrastructure |
|
|
|
|
|
- **Hardware**: CPU / GPU / MPS (auto-select) |
|
|
- **Framework**: PyTorch 2.0+ |
|
|
|
|
|
## Evaluation Results |
|
|
|
|
|
### Performance Metrics |
|
|
|
|
|
| Metric | Value | |
|
|
|--------|-------| |
|
|
| **Overshoot** | 0.99% | |
|
|
| **Settling Time (±5%)** | 0.40 s | |
|
|
| **Rise Time** | 0.40 s | |
|
|
| **Peak Time** | 10.50 s | |
|
|
| **Static Error** | -0.0002 | |
|
|
| **Oscillation Count** | 0 | |
|
|
| **Performance Index** | 0.269 | |
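Metrics like these can be recomputed from any recorded step response. The sketch below is a generic implementation, not the library's evaluation code, applied to a toy first-order response.

```python
import numpy as np

def step_metrics(t, y, target, band=0.05):
    """Overshoot (%), settling time into a +/-`band` of `target`, and static error.

    Assumes a positive, nonzero target (as in the 1-degree step used here).
    """
    overshoot = max(0.0, (np.max(y) - target) / target * 100.0)
    outside = np.where(np.abs(y - target) > band * abs(target))[0]
    if len(outside) == 0:
        settling_time = t[0]                # inside the band from the start
    elif outside[-1] + 1 < len(t):
        settling_time = t[outside[-1] + 1]  # first instant after the last excursion
    else:
        settling_time = float("inf")        # never settles within the window
    static_error = target - y[-1]
    return overshoot, settling_time, static_error

# Toy first-order response toward a unit step target:
t = np.arange(0.0, 5.0, 0.1)
y = 1.0 - np.exp(-3.0 * t)
ov, ts, err = step_metrics(t, y, target=1.0)
print(ov, ts)  # 0.0 and ~1.0 s (first sample where the error drops below 5%)
```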
|
|
|
|
|
### Integral Criteria |
|
|
|
|
|
| Criterion | Value | |
|
|
|-----------|-------| |
|
|
| IAE (Integral Absolute Error) | 0.05 | |
|
|
| ISE (Integral Squared Error) | 0.00 | |
|
|
| ITAE (Integral Time-weighted Absolute Error) | 0.18 | |
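These criteria are integrals of the tracking error over the episode; a minimal sketch with trapezoidal integration (again generic, not the library's code):

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of samples y over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

def integral_criteria(t, y, ref):
    """IAE, ISE and ITAE of the tracking error e(t) = ref(t) - y(t)."""
    e = np.abs(np.asarray(ref) - np.asarray(y))
    iae = trapezoid(e, t)           # integral of |e|
    ise = trapezoid(e**2, t)        # integral of e^2
    itae = trapezoid(t * e, t)      # integral of t * |e|, penalizes late errors
    return iae, ise, itae

# Constant tracking error of 0.1 over a 2-second window:
t = np.linspace(0.0, 2.0, 201)
iae, ise, itae = integral_criteria(t, np.full_like(t, 0.9), np.ones_like(t))
print(round(iae, 3), round(ise, 3), round(itae, 3))  # 0.2 0.02 0.2
```

ITAE's time weighting is why it is larger than IAE here despite the same error magnitude: errors late in the episode count for more.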
|
|
|
|
|
### Step Response Characteristics |
|
|
|
|
|
The agent demonstrates excellent step tracking performance with: |
|
|
- ✅ Minimal overshoot (~1%) |
|
|
- ✅ Fast settling time (0.4s) |
|
|
- ✅ Quick rise time (0.4s) |
|
|
- ✅ Near-zero static error |
|
|
- ✅ No oscillations |
|
|
|
|
|
## Usage |
|
|
|
|
|
### Installation |
|
|
|
|
|
```bash |
|
|
pip install tensoraerospace |
|
|
``` |
|
|
|
|
|
### Quick Start |
|
|
|
|
|
```python |
|
|
import numpy as np |
|
|
import torch |
|
|
from tensoraerospace.agent import DSAC |
|
|
from tensoraerospace.envs.b747 import ImprovedB747Env |
|
|
from tensoraerospace.signals.standart import unit_step |
|
|
|
|
|
def pick_device() -> str: |
|
|
if torch.cuda.is_available(): |
|
|
return "cuda" |
|
|
if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available(): |
|
|
return "mps" |
|
|
return "cpu" |
|
|
|
|
|
# Setup environment |
|
|
dt = 0.1 |
|
|
tn = 20.0 |
|
|
step_deg = 1.0 |
|
|
step_time_sec = 5.0 |
|
|
t = np.arange(0.0, tn + dt, dt, dtype=np.float32) |
|
|
|
|
|
# Create step reference signal (1 degree step at t=5s) |
|
|
ref = unit_step(t, degree=step_deg, time_step=step_time_sec, output_rad=True).reshape(1, -1) |
|
|
|
|
|
env = ImprovedB747Env( |
|
|
initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=float), |
|
|
reference_signal=ref, |
|
|
number_time_steps=ref.shape[1], |
|
|
dt=dt, |
|
|
include_reference_in_obs=True, |
|
|
reward_mode="step_response", |
|
|
) |
|
|
|
|
|
# Load pretrained agent |
|
|
agent = DSAC.from_pretrained("TensorAeroSpace/dsac-b747-step-response") |
|
|
agent.env = env |
|
|
agent.to_device(pick_device()) |
|
|
agent.eval() |
|
|
|
|
|
# Run evaluation |
|
|
obs, _ = env.reset() |
|
|
done = False |
|
|
total_reward = 0.0 |
|
|
|
|
|
while not done: |
|
|
action = agent.select_action(obs, evaluate=True) |
|
|
obs, reward, terminated, truncated, info = env.step(action) |
|
|
done = bool(terminated or truncated) |
|
|
total_reward += float(reward) |
|
|
|
|
|
print(f"Episode reward: {total_reward}") |
|
|
``` |
|
|
|
|
|
### Load from Local Checkpoint |
|
|
|
|
|
```python |
|
|
from tensoraerospace.agent import DSAC |
|
|
|
|
|
# Load from local directory |
|
|
agent = DSAC.from_pretrained("./path/to/checkpoint") |
|
|
``` |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- **Fixed Aircraft Model**: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft |
|
|
- **Step Reference Only**: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary |
|
|
- **Simulation Gap**: Trained in simulation; real-world deployment would require additional validation |
|
|
- **State Observability**: Assumes all longitudinal states are observable |
|
|
- **Linear Dynamics**: Based on linearized aircraft model around trim conditions |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
- **Not for Real Flight Control**: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval. |
|
|
- **Simulation Only**: All training and evaluation performed in simulation environments. |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model in your research, please cite: |
|
|
|
|
|
```bibtex |
|
|
@software{tensoraerospace2024, |
|
|
title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework}, |
|
|
author = {TensorAeroSpace Team}, |
|
|
year = {2024}, |
|
|
url = {https://github.com/TensorAeroSpace/TensorAeroSpace}, |
|
|
license = {MIT} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Model Card Authors |
|
|
|
|
|
TensorAeroSpace Team |
|
|
|
|
|
## Model Card Contact |
|
|
|
|
|
- **GitHub**: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace) |
|
|
- **Documentation**: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/) |
|
|
- **Hugging Face**: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace) |
|
|
|
|
|
|