Mr8bit committed on
Commit
b74c508
·
verified ·
1 Parent(s): fcca598

Create README.md

Files changed (1)
  1. README.md +278 -0
README.md ADDED
@@ -0,0 +1,278 @@
1
+ ---
2
+ license: mit
3
+ language:
4
+ - en
5
+ tags:
6
+ - reinforcement-learning
7
+ - pytorch
8
+ - ppo
9
+ - aerospace
10
+ - flight-control
11
+ - boeing-747
12
+ - continuous-control
13
+ - gymnasium
14
+ library_name: tensoraerospace
15
+ pipeline_tag: reinforcement-learning
16
+ model-index:
17
+ - name: PPO-B747-PitchControl
18
+   results:
19
+   - task:
20
+       type: reinforcement-learning
21
+       name: Pitch Angle Tracking Control
22
+     dataset:
23
+       type: custom
24
+       name: Boeing 747 Longitudinal Dynamics Simulation
25
+     metrics:
26
+     - type: eval_reward
27
+       value: 0.9137
28
+       name: Best Evaluation Reward
29
+     - type: overshoot
30
+       value: 0.49
31
+       name: Overshoot (%)
32
+     - type: settling_time
33
+       value: 0.60
34
+       name: Settling Time (s)
35
+     - type: rise_time
36
+       value: 0.30
37
+       name: Rise Time (s)
38
+     - type: static_error
39
+       value: 0.0046
40
+       name: Static Error
41
+ ---
42
+
43
+ # PPO Agent for Boeing 747 Pitch Angle Control
44
+
45
+ <div align="center">
46
+
47
+ ![TensorAeroSpace](https://raw.githubusercontent.com/TensorAeroSpace/TensorAeroSpace/main/img/logo-no-background.png)
48
+
49
+ **Proximal Policy Optimization (PPO) for Longitudinal Aircraft Control**
50
+
51
+ [![TensorAeroSpace](https://img.shields.io/badge/%F0%9F%9A%80-TensorAeroSpace-blue)](https://github.com/TensorAeroSpace/TensorAeroSpace)
52
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
53
+ [![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
54
+
55
+ </div>
56
+
57
+ ## Model Description
58
+
59
+ This model is a **Proximal Policy Optimization (PPO)** agent trained to control the pitch angle (θ) of a **Boeing 747** aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.
60
+
61
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/g79y7SGa8VyXCDqDjd_GO.png)
62
+
63
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/602bf7c9c4f8038e9a1e0a65/OZcb5JP_txYA9WEqjHGa5.png)
64
+
65
+ ### Intended Uses
66
+
67
+ - **Primary Use**: Automatic pitch angle tracking and stabilization for Boeing 747 aircraft simulation
68
+ - **Research Applications**: Benchmarking RL algorithms for aerospace control systems
69
+ - **Educational**: Learning reinforcement learning concepts in aerospace applications
70
+ - **Hybrid Control**: Can be combined with PID/MPC controllers for robust flight control
71
+
72
+ ### Model Architecture
73
+
74
+ The PPO agent consists of separate **Actor** and **Critic** neural networks:
75
+
76
+ #### Actor Network (Policy)
77
+ | Layer | Configuration |
78
+ |-------|--------------|
79
+ | Input | 4 (observation dim) |
80
+ | Hidden 1 | Linear(4, 256) + ReLU |
81
+ | Hidden 2 | Linear(256, 256) + ReLU |
82
+ | Output (μ) | Linear(256, 1) + Tanh |
83
+ | Output (log σ) | Linear(256, 1), clamped to [-5.0, -1.5] |
84
+
85
+ #### Critic Network (Value Function)
86
+ | Layer | Configuration |
87
+ |-------|--------------|
88
+ | Input | 4 (observation dim) |
89
+ | Hidden 1 | Linear(4, 256) + ReLU |
90
+ | Hidden 2 | Linear(256, 256) + ReLU |
91
+ | Output | Linear(256, 1) |
92
+
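The two tables above can be read as the following PyTorch modules. This is a minimal sketch, not the `tensoraerospace` implementation: the class names `Actor` and `Critic`, the shared hidden body feeding both actor heads, and the constant names are assumptions made for illustration. Only the layer sizes, activations, and the log σ clamp range come from the tables.

```python
import torch
import torch.nn as nn

# Clamp range for log sigma, taken from the Actor table above.
LOG_STD_MIN, LOG_STD_MAX = -5.0, -1.5


class Actor(nn.Module):
    """Gaussian policy: 4 observations -> mean and log-std of one action."""

    def __init__(self, obs_dim: int = 4, act_dim: int = 1, hidden: int = 256):
        super().__init__()
        # Assumption: both output heads share the two hidden layers.
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu_head = nn.Linear(hidden, act_dim)
        self.log_std_head = nn.Linear(hidden, act_dim)

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        mu = torch.tanh(self.mu_head(h))  # mean bounded to [-1, 1]
        log_std = torch.clamp(self.log_std_head(h), LOG_STD_MIN, LOG_STD_MAX)
        return mu, log_std


class Critic(nn.Module):
    """State-value function: 4 observations -> scalar value estimate."""

    def __init__(self, obs_dim: int = 4, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)
```

With the clamp range above, the policy standard deviation stays between exp(-5.0) ≈ 0.0067 and exp(-1.5) ≈ 0.22 in normalized action units, so exploration noise is bounded on both sides.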
93
+ ### State Space
94
+
95
+ The observation vector consists of 4 normalized states representing the longitudinal dynamics:
96
+
97
+ | Index | State | Description | Units |
98
+ |-------|-------|-------------|-------|
99
+ | 0 | u | Forward velocity perturbation | normalized |
100
+ | 1 | w | Vertical velocity perturbation | normalized |
101
+ | 2 | q | Pitch rate | normalized |
102
+ | 3 | θ | Pitch angle (tracked variable) | normalized |
103
+
104
+ ### Action Space
105
+
106
+ | Dimension | Description | Range |
107
+ |-----------|-------------|-------|
108
+ | 1 | Elevator deflection | [-1.0, 1.0] (normalized) |
109
+
110
+ The normalized action is scaled to physical elevator deflection in degrees by the environment.
111
+
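The scaling step mentioned above might look like the sketch below. The limit `MAX_ELEVATOR_DEG` and the function `scale_action` are hypothetical: the README does not state the physical deflection limit or how the environment performs the mapping, so the value here is a placeholder only.

```python
# Hypothetical elevator limit in degrees; NOT taken from the environment.
MAX_ELEVATOR_DEG = 25.0


def scale_action(normalized_action: float,
                 max_deflection_deg: float = MAX_ELEVATOR_DEG) -> float:
    """Map a normalized action in [-1, 1] to an elevator deflection in degrees."""
    # Clip first so out-of-range policy outputs cannot exceed the limit.
    clipped = max(-1.0, min(1.0, normalized_action))
    return clipped * max_deflection_deg
```

A symmetric linear mapping like this is the simplest choice; the actual environment may use different limits or an asymmetric range.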
112
+ ## Training Details
113
+
114
+ ### Training Configuration
115
+
116
+ | Hyperparameter | Value |
117
+ |----------------|-------|
118
+ | Algorithm | PPO (Clip) |
119
+ | Max Episodes | 90,000 |
120
+ | Rollout Length | 256 steps |
121
+ | Batch Size | 16,384 |
122
+ | Epochs per Update | 2 |
123
+ | Clip Parameter (ε) | 0.15 |
124
+ | Discount Factor (γ) | 0.995 |
125
+ | GAE Lambda (λ) | 0.95 |
126
+ | Actor Learning Rate | 1e-4 |
127
+ | Critic Learning Rate | 2e-4 |
128
+ | Entropy Coefficient | 0.01 |
129
+ | Max Gradient Norm | 0.5 |
130
+ | Target KL | 0.01 |
131
+ | Normalize Observations | False |
132
+ | Normalize Rewards | True |
133
+
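To show where γ, λ, and ε from the table enter the algorithm, here is a plain-Python sketch of Generalized Advantage Estimation and the clipped surrogate term for a single sample. The function names are illustrative, and the sketch assumes a rollout with no episode termination inside it; it is not the training code used for this model.

```python
import math

# Values from the Training Configuration table above.
GAMMA, LAM, CLIP_EPS = 0.995, 0.95, 0.15


def gae_advantages(rewards, values, last_value, gamma=GAMMA, lam=LAM):
    """GAE for one rollout, assuming no terminal state inside the rollout."""
    advantages = [0.0] * len(rewards)
    next_value, running = last_value, 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(logp_new, logp_old, advantage, eps=CLIP_EPS):
    """PPO-Clip objective for one sample (to be maximized)."""
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

With ε = 0.15, a positive-advantage sample stops contributing extra gradient once the probability ratio exceeds 1.15, which limits how far a single update can move the policy.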
134
+ ### Environment Configuration
135
+
136
+ | Parameter | Value |
137
+ |-----------|-------|
138
+ | Environment | `ImprovedB747VecEnvTorch` |
139
+ | Number of Parallel Envs | 64 |
140
+ | Time Step (dt) | 0.1 s |
141
+ | Episode Duration | 20 s |
142
+ | Initial State | [0, 0, 0, 0] |
143
+ | Reference Signal | Step function |
144
+ | Step Amplitude Range | 1.0° |
145
+ | Step Time Range | 5.0 s |
146
+
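A quick arithmetic check ties the two configuration tables together. The variable names below are illustrative. The reading that each update consumes the full rollout as one batch is an inference from the numbers, not something the README states.

```python
NUM_ENVS = 64            # parallel environments
ROLLOUT_LENGTH = 256     # steps collected per environment per rollout
DT = 0.1                 # seconds per simulation step
EPISODE_DURATION = 20.0  # seconds

# Transitions gathered per rollout across all environments.
transitions_per_update = NUM_ENVS * ROLLOUT_LENGTH
# Simulation steps in one 20 s episode at dt = 0.1 s.
steps_per_episode = round(EPISODE_DURATION / DT)

print(transitions_per_update)  # 16384, equal to the listed batch size
print(steps_per_episode)       # 200
```

Whether the environment counts 200 or 201 samples per episode depends on how `generate_time_period` treats the end point, which is not specified here.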
147
+ ### Training Infrastructure
148
+
149
+ - **Hardware**: NVIDIA GPU with CUDA support
150
+ - **Framework**: PyTorch 2.0+
151
+ - **Training Time**: ~7,510 episodes to reach the best checkpoint (wall-clock time not reported)
152
+ - **Best Episode**: 7,510
153
+
154
+ ## Evaluation Results
155
+
156
+ ### Performance Metrics
157
+
158
+ | Metric | Value |
159
+ |--------|-------|
160
+ | **Best Evaluation Reward** | 0.9137 |
161
+ | **Overshoot** | 0.49% |
162
+ | **Settling Time** | 0.60 s |
163
+ | **Rise Time** | 0.30 s |
164
+ | **Peak Time** | 0.80 s |
165
+ | **Static Error** | -0.0046 |
166
+ | **Oscillation Count** | 1 |
167
+ | **Performance Index** | 3.06 |
168
+
169
+ ### Integral Criteria
170
+
171
+ | Criterion | Value |
172
+ |-----------|-------|
173
+ | IAE (Integral Absolute Error) | 4.08 |
174
+ | ISE (Integral Squared Error) | 2.64 |
175
+ | ITAE (Integral Time-weighted Absolute Error) | 4.77 |
176
+
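For reference, the three integral criteria can be approximated from a sampled tracking error as sketched below. The function name and the rectangular integration rule are assumptions; the README does not say which units or integration scheme produced the values in the table, so this sketch is not expected to reproduce them exactly.

```python
def integral_criteria(errors, dt):
    """Return (IAE, ISE, ITAE) for a sampled error signal e[k] = reference - output.

    Uses a rectangular (left Riemann) rule with time t[k] = k * dt.
    """
    iae = sum(abs(e) for e in errors) * dt
    ise = sum(e * e for e in errors) * dt
    itae = sum(k * dt * abs(e) for k, e in enumerate(errors)) * dt
    return iae, ise, itae
```

ITAE weights late errors more heavily than IAE, so it penalizes slow settling and residual static error more than a brief initial transient.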
177
+ ### Step Response Characteristics
178
+
179
+ On the step reference used for evaluation, the agent's response shows:
180
+ - ✅ Minimal overshoot (<1%)
181
+ - ✅ Fast settling time (0.6s)
182
+ - ✅ Quick rise time (0.3s)
183
+ - ✅ Near-zero static error
184
+ - ✅ Minimal oscillations (1 cycle)
185
+
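The step-response figures listed above can be extracted from a recorded response along the following lines. This is a sketch under stated assumptions: the 10% to 90% rise-time definition, the 5% settling band, and the function name `step_metrics` are choices made for illustration, and the README does not specify which definitions were used for the reported values.

```python
def step_metrics(t, y, target, band=0.05):
    """Overshoot (%), rise time, and settling time for a step from 0 to target.

    Assumes target > 0, time measured from the step onset, and a response
    that actually reaches 90% of the target within the record.
    """
    peak = max(y)
    overshoot_pct = max(0.0, (peak - target) / target * 100.0)

    # Rise time: first crossing of 10% to first crossing of 90% of the target.
    t10 = next(ti for ti, yi in zip(t, y) if yi >= 0.1 * target)
    t90 = next(ti for ti, yi in zip(t, y) if yi >= 0.9 * target)

    # Settling time: first sample after the last excursion outside the band.
    outside = [i for i, yi in enumerate(y) if abs(yi - target) > band * target]
    if not outside:
        settling_time = t[0]
    elif outside[-1] + 1 < len(t):
        settling_time = t[outside[-1] + 1]
    else:
        settling_time = None  # never settles within the recorded window

    return {
        "overshoot_pct": overshoot_pct,
        "rise_time": t90 - t10,
        "settling_time": settling_time,
    }
```

With dt = 0.1 s, all time-based metrics computed this way are resolved only to the nearest 0.1 s, which is worth keeping in mind when comparing sub-second values.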
186
+ ## Usage
187
+
188
+ ### Installation
189
+
190
+ ```bash
191
+ pip install tensoraerospace
192
+ ```
193
+
194
+ ### Quick Start
195
+
196
+ ```python
197
+ import numpy as np
198
+ import torch
199
+ from tensoraerospace.agent.ppo.model import PPO
200
+ from tensoraerospace.envs.b747 import ImprovedB747Env
201
+ from tensoraerospace.signals.standart import unit_step
202
+ from tensoraerospace.utils import generate_time_period, convert_tp_to_sec_tp
203
+
204
+ # Load pretrained agent
205
+ agent = PPO.from_pretrained("TensorAeroSpace/ppo-b747-pitch-control")
206
+
207
+ # Setup environment
208
+ dt = 0.1
209
+ tp = generate_time_period(tn=20, dt=dt)
210
+ tps = convert_tp_to_sec_tp(tp, dt=dt)
211
+
212
+ # Create step reference signal (1 degree step at t=5s)
213
+ reference = unit_step(tp=tps, degree=1.0, time_step=5.0, output_rad=True).reshape(1, -1)
214
+
215
+ env = ImprovedB747Env(
216
+     initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),
217
+     reference_signal=reference,
218
+     number_time_steps=len(tp),
219
+     dt=dt,
220
+ )
221
+
222
+ # Run evaluation
223
+ obs, _ = env.reset()
224
+ done = False
225
+
226
+ while not done:
227
+     action, mean_action, _ = agent.act(obs, deterministic=True)
228
+     action_scalar = float(np.asarray(mean_action).flatten()[0])
229
+     obs, reward, terminated, truncated, info = env.step(action_scalar)
230
+     done = terminated or truncated
231
+ ```
232
+
233
+ ### Load from Local Checkpoint
234
+
235
+ ```python
236
+ from tensoraerospace.agent.ppo.model import PPO
237
+
238
+ # Load from local directory
239
+ agent = PPO.from_pretrained("./path/to/checkpoint")
240
+ ```
241
+
242
+ ## Limitations
243
+
244
+ - **Fixed Aircraft Model**: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
245
+ - **Step Reference Only**: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
246
+ - **Simulation Gap**: Trained in simulation; real-world deployment would require additional validation
247
+ - **State Observability**: Assumes all 4 longitudinal states are observable
248
+ - **Linear Dynamics**: Based on linearized aircraft model around trim conditions
249
+
250
+ ## Ethical Considerations
251
+
252
+ - **Not for Real Flight Control**: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
253
+ - **Simulation Only**: All training and evaluation performed in simulation environments.
254
+
255
+ ## Citation
256
+
257
+ If you use this model in your research, please cite:
258
+
259
+ ```bibtex
260
+ @software{tensoraerospace2024,
261
+ title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
262
+ author = {TensorAeroSpace Team},
263
+ year = {2024},
264
+ url = {https://github.com/TensorAeroSpace/TensorAeroSpace},
265
+ license = {MIT}
266
+ }
267
+ ```
268
+
269
+ ## Model Card Authors
270
+
271
+ TensorAeroSpace Team
272
+
273
+ ## Model Card Contact
274
+
275
+ - **GitHub**: [TensorAeroSpace/TensorAeroSpace](https://github.com/TensorAeroSpace/TensorAeroSpace)
276
+ - **Documentation**: [tensoraerospace.readthedocs.io](https://tensoraerospace.readthedocs.io/)
277
+ - **Hugging Face**: [TensorAeroSpace](https://huggingface.co/TensorAeroSpace)
278
+