---
library_name: tensoraerospace
tags:
  - reinforcement-learning
  - control
  - aerospace
  - boeing-747
  - gymnasium
  - sac
license: mit
datasets: []
language: []
model-index:
  - name: SAC Boeing 747 Pitch Control (ImprovedB747Env)
    results: []
---

# SAC Boeing 747 Pitch Control (ImprovedB747Env)

This model is a Soft Actor-Critic (SAC) agent trained to control the pitch channel of a Boeing 747 in the `tensoraerospace.envs.b747.ImprovedB747Env` environment. The agent tracks a reference pitch profile while minimizing control effort and promoting smooth control inputs.

## Model Details

- **Developed by:** TensorAeroSpace
- **Shared by:** TensorAeroSpace
- **Model type:** Reinforcement Learning (Soft Actor-Critic, continuous control)
- **Environment:** `tensoraerospace.envs.b747.ImprovedB747Env`
- **Action space:** normalized `[-1, 1]`, mapped to a stabilizer deflection of ±25 deg
- **Observation:** `[norm_pitch_error, norm_q, norm_theta, norm_prev_action]`
- **License:** MIT
- **Finetuned from:** trained from scratch

### Sources

- **Repository:** https://github.com/tensoraerospace/tensoraerospace
- **Docs:** https://tensoraerospace.readthedocs.io/

## Uses

### Direct Use

Use the pretrained policy to simulate pitch-tracking tasks in the provided environment. It is suitable for research and for demonstrating RL-based flight control.

### Out-of-Scope Use

- Real aircraft control or safety-critical deployment without rigorous certification.
- Environments and state/action definitions that differ from `ImprovedB747Env`.

## How to Get Started

### Install

```bash
pip install tensoraerospace
```

### Load the Agent Locally

```python
from tensoraerospace.agent.sac import SAC

agent = SAC.from_pretrained(
    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
    load_gradients=False,  # set True to resume training with optimizer states
)

# Evaluate the policy deterministically over one episode
obs, info = agent.env.reset()
done = False
while not done:
    action = agent.select_action(obs, evaluate=True)
    obs, reward, terminated, truncated, info = agent.env.step(action)
    done = terminated or truncated
```

### Continue Training from Checkpoint

```python
from tensoraerospace.agent.sac import SAC

agent = SAC.from_pretrained(
    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
    load_gradients=True,  # restore optimizer states as well
)
agent.train(num_episodes=10)
agent.save("./runs", save_gradients=True)
```

## Training Details

The saved `config.json` contains the exact environment and policy parameters used for training. Key entries:

- `env.name`: `tensoraerospace.envs.b747.ImprovedB747Env`
- `env.params`:
  - `initial_state`: `[0, 0, 0, 0]`
  - `reference_signal`: shape `(1, 201)`, a sinusoidal-like pitch target
  - `number_time_steps`: `201`
- `policy.params`:
  - `gamma`: `0.99`
  - `tau`: `0.02`
  - `alpha`: `auto` via automatic entropy tuning
  - `batch_size`: `256`
  - `updates_per_step`: `2`
  - `target_update_interval`: `1`
  - `lr`: `3e-4`
  - `policy_type`: `Gaussian`
  - `device`: `cpu`

Note: with `automatic_entropy_tuning=True`, the `log_alpha` value and `alpha_optim` state are saved and can be restored.

## Evaluation

The agent was validated in simulation on the same environment by tracking the provided reference pitch signal over `201` steps. The reward corresponds to negative quadratic costs on tracking error, pitch rate, control magnitude, control smoothness, and jerk.
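For intuition, the sketch below shows one plausible way such a quadratic-cost reward can be assembled. The function name, weights, and exact term definitions are illustrative assumptions made for this card, not the actual implementation inside `ImprovedB747Env`; consult the environment source in the repository for the authoritative reward.

```python
def quadratic_cost_reward(
    pitch_error: float,   # theta_ref - theta (rad), tracking error
    pitch_rate: float,    # q (rad/s)
    action: float,        # current normalized action in [-1, 1]
    prev_action: float,   # previous normalized action
    prev_delta: float,    # previous action increment, used for the jerk term
    weights=(1.0, 0.1, 0.01, 0.05, 0.05),  # hypothetical weights, not taken from config.json
) -> float:
    """Return the negative weighted sum of quadratic costs (illustrative only)."""
    delta = action - prev_action   # control smoothness: change of action per step
    jerk = delta - prev_delta      # jerk: change of the action increment
    cost = (
        weights[0] * pitch_error ** 2
        + weights[1] * pitch_rate ** 2
        + weights[2] * action ** 2
        + weights[3] * delta ** 2
        + weights[4] * jerk ** 2
    )
    return -cost
```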
## Bias, Risks, and Limitations

- Simulation fidelity limits real-world applicability.
- The agent was trained on a specific reference signal and time horizon; generalization to other profiles requires retraining.
- Safety constraints are enforced only implicitly through reward shaping and actuator bounds; the policy is not certified for real flight.

## Environmental Impact

This checkpoint was trained on CPU. For large-scale training, estimate CO2eq emissions with the [ML CO2 Impact](https://mlco2.github.io/impact#compute) calculator.

## Technical Specs

- **Algorithm:** Soft Actor-Critic
- **Networks:** MLP policy and twin Q-networks (hidden size: 256 by default)
- **Frameworks:** PyTorch, Gymnasium

## Citation

If you use this model, please cite the TensorAeroSpace repository.

```bibtex
@misc{tensoraerospace,
  title        = {TensorAeroSpace: Aerospace Simulation and RL Framework},
  author       = {TensorAeroSpace contributors},
  year         = {2023},
  howpublished = {\url{https://github.com/tensoraerospace/tensoraerospace}},
}
```

## Model Card Authors

TensorAeroSpace Team

## Contact

For questions, please open an issue in the repository or email support@tensoraerospace.org.