+---
+library_name: tensoraerospace
+tags:
+  - reinforcement-learning
+  - control
+  - aerospace
+  - boeing-747
+  - gymnasium
+  - sac
+license: mit
+datasets: []
+language: []
+model-index:
+  - name: SAC Boeing 747 Pitch Control (ImprovedB747Env)
+    results: []
+---
+# SAC Boeing 747 Pitch Control (ImprovedB747Env)
+This model is a Soft Actor-Critic (SAC) agent trained to control the pitch channel of a Boeing 747 in the `tensoraerospace.envs.b747.ImprovedB747Env` environment. The agent tracks a reference pitch profile while minimizing control effort and promoting smoothness.
+## Model Details
+- **Developed by:** TensorAeroSpace
+- **Shared by:** TensorAeroSpace
+- **Model type:** Reinforcement Learning — Soft Actor-Critic (continuous control)
+- **Environment:** `tensoraerospace.envs.b747.ImprovedB747Env`
+- **Action space:** normalized [-1, 1] (mapped to stabilizer angle ±25 deg)
+- **Observation:** `[norm_pitch_error, norm_q, norm_theta, norm_prev_action]`
+- **License:** MIT
+- **Finetuned from:** None (trained from scratch)
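As an illustration of the action-space entry above, the following minimal sketch maps a normalized action to a stabilizer angle. It assumes the mapping is a plain linear scaling with clipping; the function name `action_to_stabilizer_deg` is hypothetical and is not part of the `tensoraerospace` API, and the environment may implement the mapping differently.

```python
def action_to_stabilizer_deg(action: float, max_deflection_deg: float = 25.0) -> float:
    """Map a normalized action in [-1, 1] to a stabilizer angle in degrees.

    Assumption: linear scaling, with out-of-range actions clipped to the bounds.
    """
    clipped = max(-1.0, min(1.0, action))
    return clipped * max_deflection_deg


if __name__ == "__main__":
    for a in (-1.0, -0.5, 0.0, 0.5, 1.0, 3.0):
        print(f"action={a:+.1f} -> stabilizer={action_to_stabilizer_deg(a):+.1f} deg")
```

Under this assumption, an action of `0.5` corresponds to half of the maximum deflection, and anything outside `[-1, 1]` saturates at ±25 deg.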
+### Sources
+- **Repository:** https://github.com/tensoraerospace/tensoraerospace
+- **Docs:** https://tensoraerospace.readthedocs.io/
+## Uses
+### Direct Use
+Use the pretrained policy to simulate pitch-tracking tasks in the provided environment. It is suitable for research on, and demonstration of, RL-based flight control.
+### Out-of-Scope Use
+- Real aircraft control or safety-critical deployment without rigorous certification.
+- Environments and state/action definitions that differ from `ImprovedB747Env`.
+## How to Get Started
+### Install
+```bash
+pip install tensoraerospace
+```
+### Load the Agent Locally
+```python
+from tensoraerospace.agent.sac import SAC
+agent = SAC.from_pretrained(
+    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
+    load_gradients=False,  # set True to resume training with optimizer states
+)
+# Evaluate: roll out one episode until the environment terminates or truncates
+obs, info = agent.env.reset()
+done = False
+while not done:
+    action = agent.select_action(obs, evaluate=True)
+    obs, reward, terminated, truncated, info = agent.env.step(action)
+    done = terminated or truncated
+```
+### Continue Training from Checkpoint
+```python
+from tensoraerospace.agent.sac import SAC
+agent = SAC.from_pretrained(
+    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
+    load_gradients=True,
+)
+agent.train(num_episodes=10)
+agent.save("./runs", save_gradients=True)
+```
+## Training Details
+The saved `config.json` contains the exact environment and policy parameters used for training. Key entries:
+- `env.name`: `tensoraerospace.envs.b747.ImprovedB747Env`
+- `env.params`:
+  - `initial_state`: `[0, 0, 0, 0]`
+  - `reference_signal`: shape `(1, 201)` sinusoidal-like target for pitch
+  - `number_time_steps`: `201`
+- `policy.params`:
+  - `gamma`: `0.99`
+  - `tau`: `0.02`
+  - `alpha`: `auto` (set by automatic entropy tuning)
+  - `batch_size`: `256`
+  - `updates_per_step`: `2`
+  - `target_update_interval`: `1`
+  - `lr`: `3e-4`
+  - `policy_type`: `Gaussian`
+  - `device`: `cpu`
+Note: With `automatic_entropy_tuning=True`, the `log_alpha` and `alpha_optim` states are saved with the checkpoint and can be restored.
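To make the role of `tau` and `updates_per_step` concrete, here is a small sketch that parses a config excerpt and applies a soft (Polyak) target update. The JSON nesting shown is an assumption inferred from the dotted key names above, so the real `config.json` may be laid out differently, and `soft_update` is a hypothetical helper operating on plain lists rather than network parameters.

```python
import json

# Hypothetical excerpt: the nesting is inferred from the keys listed above
# and may not match the saved config.json exactly.
config_text = """
{
  "env": {
    "name": "tensoraerospace.envs.b747.ImprovedB747Env",
    "params": {"initial_state": [0, 0, 0, 0], "number_time_steps": 201}
  },
  "policy": {
    "params": {"gamma": 0.99, "tau": 0.02, "batch_size": 256,
               "updates_per_step": 2, "target_update_interval": 1,
               "lr": 3e-4, "policy_type": "Gaussian", "device": "cpu"}
  }
}
"""
config = json.loads(config_text)
policy = config["policy"]["params"]


def soft_update(target, online, tau):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


# With tau = 0.02 the target weights move 2% of the way toward the online weights.
target_weights = soft_update([0.0, 1.0], [1.0, 1.0], policy["tau"])

# Upper bound on gradient updates in one 201-step episode, assuming updates
# are performed on every environment step.
updates_per_episode = policy["updates_per_step"] * config["env"]["params"]["number_time_steps"]

if __name__ == "__main__":
    print(target_weights, updates_per_episode)
```

With `target_update_interval` set to `1`, this kind of soft update would run after every gradient step, which is the usual pairing for a small `tau`.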
+## Evaluation
+The agent was validated in simulation on the training environment by tracking the provided reference pitch signal over `201` steps. The reward is built from negative quadratic costs on tracking error, pitch rate, control magnitude, smoothness, and jerk.
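The reward terms named above can be sketched as follows. This is not the environment's actual reward: the weights are placeholders, and treating "smoothness" as the first difference of the action and "jerk" as its second difference is an assumption. The sinusoidal reference and the `mean_absolute_error` helper are likewise illustrative only.

```python
import math


def quadratic_reward(pitch_error, pitch_rate, action, prev_action, prev_prev_action,
                     w_error=1.0, w_rate=0.1, w_action=0.01, w_smooth=0.01, w_jerk=0.01):
    """Negative weighted sum of quadratic costs (all weights are hypothetical)."""
    smoothness = action - prev_action                        # first difference of the action
    jerk = action - 2.0 * prev_action + prev_prev_action     # second difference of the action
    cost = (w_error * pitch_error ** 2
            + w_rate * pitch_rate ** 2
            + w_action * action ** 2
            + w_smooth * smoothness ** 2
            + w_jerk * jerk ** 2)
    return -cost


def mean_absolute_error(reference, measured):
    """Mean absolute tracking error between two equal-length sequences."""
    return sum(abs(r - m) for r, m in zip(reference, measured)) / len(reference)


# Illustrative 201-step sinusoidal reference (amplitude and period are made up).
n_steps = 201
reference = [math.sin(2.0 * math.pi * k / (n_steps - 1)) for k in range(n_steps)]

if __name__ == "__main__":
    lagging = [0.9 * r for r in reference]  # pretend the pitch reaches 90% of the target
    print(mean_absolute_error(reference, lagging))
    print(quadratic_reward(0.1, 0.0, 0.2, 0.1, 0.0))
```

Because every term is a squared quantity with a non-negative weight, the reward in this sketch is never positive and reaches zero only for perfect, motionless tracking.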
+## Bias, Risks, and Limitations
+- Simulation fidelity limits real-world applicability.
+- Trained on a specific reference and time horizon; generalization requires retraining.
+- Safety constraints are implicit via reward shaping and bounds; not certified for real flight.
+## Environmental Impact
+Training was performed on CPU for this checkpoint. For large-scale training, estimate CO2eq with the [ML CO2 Impact](https://mlco2.github.io/impact#compute) calculator.
+## Technical Specs
+- **Algorithm:** Soft Actor-Critic
+- **Networks:** MLP policy and twin Q-networks (hidden size: 256 by default)
+- **Frameworks:** PyTorch, Gymnasium
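For readers unfamiliar with why SAC keeps twin Q-networks, the sketch below computes the standard soft Bellman target for a single transition using scalar values. It follows the textbook SAC formulation rather than this repository's code; the function name `soft_q_target` is hypothetical, and `alpha=0.2` is only a placeholder, since this checkpoint tunes `alpha` automatically.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target: r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).

    Taking the minimum of the two target critics counteracts Q-value overestimation.
    """
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    return reward + gamma * (1.0 - float(done)) * soft_value


if __name__ == "__main__":
    # Non-terminal transition: the more pessimistic critic (2.0) is used.
    print(soft_q_target(reward=-0.1, done=False, next_q1=2.0, next_q2=3.0, next_log_prob=-1.0))
    # Terminal transition: the bootstrap term is dropped.
    print(soft_q_target(reward=-0.1, done=True, next_q1=2.0, next_q2=3.0, next_log_prob=-1.0))
```

In a full implementation both critics are regressed toward this same target, and the target critics themselves are updated slowly from the online critics.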
+## Citation
+If you use this model, please cite the TensorAeroSpace repository.
+```bibtex
+@misc{tensoraerospace,
+  title        = {TensorAeroSpace: Aerospace Simulation and RL Framework},
+  author       = {TensorAeroSpace contributors},
+  year         = {2023},
+  howpublished = {\url{https://github.com/tensoraerospace/tensoraerospace}},
+}
+```
+## Model Card Authors
+TensorAeroSpace Team
+## Contact
+For questions, please open an issue at the repository or email support@tensoraerospace.org.