---
library_name: tensoraerospace
tags:
  - reinforcement-learning
  - control
  - aerospace
  - boeing-747
  - gymnasium
  - sac
license: mit
datasets: []
language: []
model-index:
  - name: SAC Boeing 747 Pitch Control (ImprovedB747Env)
    results: []
---

# SAC Boeing 747 Pitch Control (ImprovedB747Env)

This model is a Soft Actor-Critic (SAC) agent trained to control the pitch channel of a Boeing 747 in the `tensoraerospace.envs.b747.ImprovedB747Env` environment. The agent tracks a reference pitch profile while minimizing control effort and promoting smooth control inputs.

## Model Details

- **Developed by:** TensorAeroSpace
- **Shared by:** TensorAeroSpace
- **Model type:** Reinforcement Learning (Soft Actor-Critic, continuous control)
- **Environment:** `tensoraerospace.envs.b747.ImprovedB747Env`
- **Action space:** normalized `[-1, 1]`, mapped to a stabilizer deflection of ±25 deg
- **Observation:** `[norm_pitch_error, norm_q, norm_theta, norm_prev_action]`
- **License:** MIT
- **Finetuned from:** trained from scratch

### Sources

- **Repository:** https://github.com/tensoraerospace/tensoraerospace
- **Docs:** https://tensoraerospace.readthedocs.io/

## Uses

### Direct Use

Use the pretrained policy to simulate pitch-tracking tasks in the provided environment. It is suitable for research and for demonstrating RL-based flight control.

### Out-of-Scope Use

- Real aircraft control or safety-critical deployment without rigorous certification.
- Environments and state/action definitions that differ from `ImprovedB747Env`.

## How to Get Started

### Install

```bash
pip install tensoraerospace
```

### Load the Agent Locally

```python
from tensoraerospace.agent.sac import SAC

agent = SAC.from_pretrained(
    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
    load_gradients=False,  # set True to resume training with optimizer states
)

# Evaluate the policy deterministically over one episode
obs, info = agent.env.reset()
done = False
while not done:
    action = agent.select_action(obs, evaluate=True)
    obs, reward, terminated, truncated, info = agent.env.step(action)
    done = terminated or truncated
```

### Continue Training from Checkpoint

```python
from tensoraerospace.agent.sac import SAC

agent = SAC.from_pretrained(
    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
    load_gradients=True,  # restore optimizer states as well
)
agent.train(num_episodes=10)
agent.save("./runs", save_gradients=True)
```

## Training Details

The saved `config.json` contains the exact environment and policy parameters used for training. Key entries:

- `env.name`: `tensoraerospace.envs.b747.ImprovedB747Env`
- `env.params`:
  - `initial_state`: `[0, 0, 0, 0]`
  - `reference_signal`: shape `(1, 201)`, a sinusoidal-like pitch target
  - `number_time_steps`: `201`
- `policy.params`:
  - `gamma`: `0.99`
  - `tau`: `0.02`
  - `alpha`: `auto` via automatic entropy tuning
  - `batch_size`: `256`
  - `updates_per_step`: `2`
  - `target_update_interval`: `1`
  - `lr`: `3e-4`
  - `policy_type`: `Gaussian`
  - `device`: `cpu`

Note: with `automatic_entropy_tuning=True`, the `log_alpha` value and `alpha_optim` state are saved and can be restored.

## Evaluation

The agent was validated in simulation on the same environment by tracking the provided reference pitch signal over `201` steps. The reward corresponds to negative quadratic costs on tracking error, pitch rate, control magnitude, control smoothness, and jerk.
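For intuition, the sketch below shows one plausible way such a quadratic-cost reward can be assembled. The function name, weights, and exact term definitions are illustrative assumptions made for this card, not the actual implementation inside `ImprovedB747Env`; consult the environment source in the repository for the authoritative reward.

```python
def quadratic_cost_reward(
    pitch_error: float,   # theta_ref - theta (rad), tracking error
    pitch_rate: float,    # q (rad/s)
    action: float,        # current normalized action in [-1, 1]
    prev_action: float,   # previous normalized action
    prev_delta: float,    # previous action increment, used for the jerk term
    weights=(1.0, 0.1, 0.01, 0.05, 0.05),  # hypothetical weights, not taken from config.json
) -> float:
    """Return the negative weighted sum of quadratic costs (illustrative only)."""
    delta = action - prev_action   # control smoothness: change of action per step
    jerk = delta - prev_delta      # jerk: change of the action increment
    cost = (
        weights[0] * pitch_error ** 2
        + weights[1] * pitch_rate ** 2
        + weights[2] * action ** 2
        + weights[3] * delta ** 2
        + weights[4] * jerk ** 2
    )
    return -cost
```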
## Bias, Risks, and Limitations

- Simulation fidelity limits real-world applicability.
- The agent was trained on a specific reference signal and time horizon; generalization to other profiles requires retraining.
- Safety constraints are enforced only implicitly through reward shaping and actuator bounds; the policy is not certified for real flight.

## Environmental Impact

This checkpoint was trained on CPU. For large-scale training, estimate CO2eq emissions with the [ML CO2 Impact](https://mlco2.github.io/impact#compute) calculator.

## Technical Specs

- **Algorithm:** Soft Actor-Critic
- **Networks:** MLP policy and twin Q-networks (hidden size: 256 by default)
- **Frameworks:** PyTorch, Gymnasium

## Citation

If you use this model, please cite the TensorAeroSpace repository.

```bibtex
@misc{tensoraerospace,
  title        = {TensorAeroSpace: Aerospace Simulation and RL Framework},
  author       = {TensorAeroSpace contributors},
  year         = {2023},
  howpublished = {\url{https://github.com/tensoraerospace/tensoraerospace}},
}
```

## Model Card Authors

TensorAeroSpace Team

## Contact

For questions, please open an issue in the repository or email support@tensoraerospace.org.