---
library_name: tensoraerospace
tags:
- reinforcement-learning
- control
- aerospace
- boeing-747
- gymnasium
- sac
license: mit
datasets: []
language: []
model-index:
- name: SAC Boeing 747 Pitch Control (ImprovedB747Env)
results: []
---
# SAC Boeing 747 Pitch Control (ImprovedB747Env)
This model is a Soft Actor-Critic (SAC) agent trained to control the pitch channel of a Boeing 747 in the `tensoraerospace.envs.b747.ImprovedB747Env` environment. The agent tracks a reference pitch profile while minimizing control effort and promoting smoothness.
## Model Details
- **Developed by:** TensorAeroSpace
- **Shared by:** TensorAeroSpace
- **Model type:** Reinforcement Learning — Soft Actor-Critic (continuous control)
- **Environment:** `tensoraerospace.envs.b747.ImprovedB747Env`
- **Action space:** normalized to [-1, 1] (mapped to a stabilizer angle of ±25 deg)
- **Observation:** `[norm_pitch_error, norm_q, norm_theta, norm_prev_action]`
- **License:** MIT
- **Finetuned from:** Trained from scratch
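The action-space convention above can be sketched as a simple scaling step. This is an illustrative assumption about how the environment maps normalized actions to surface deflections; the function name `denormalize_action` is hypothetical and not part of the library's API.

```python
MAX_STABILIZER_DEG = 25.0  # from "mapped to a stabilizer angle of ±25 deg"

def denormalize_action(a: float, limit_deg: float = MAX_STABILIZER_DEG) -> float:
    """Map a normalized action in [-1, 1] to a stabilizer angle in degrees."""
    a = max(-1.0, min(1.0, a))  # clip to the declared action space
    return a * limit_deg

print(denormalize_action(0.5))   # 12.5 degrees
print(denormalize_action(-1.0))  # -25.0 degrees
```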
### Sources
- **Repository:** https://github.com/tensoraerospace/tensoraerospace
- **Docs:** https://tensoraerospace.readthedocs.io/
## Uses
### Direct Use
Use the pretrained policy to simulate pitch-tracking tasks in the provided environment. It is suitable for research and for demonstrating RL-based flight control.
### Out-of-Scope Use
- Real aircraft control or safety-critical deployment without rigorous certification.
- Environments and state/action definitions that differ from `ImprovedB747Env`.
## How to Get Started
### Install
```bash
pip install tensoraerospace
```
### Load the Agent Locally
```python
from tensoraerospace.agent.sac import SAC
agent = SAC.from_pretrained(
    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
    load_gradients=False,  # set True to resume training with optimizer states
)

# Evaluate the policy for one episode
obs, info = agent.env.reset()
done = False
while not done:
    action = agent.select_action(obs, evaluate=True)
    obs, reward, terminated, truncated, info = agent.env.step(action)
    done = terminated or truncated
```
### Continue Training from Checkpoint
```python
from tensoraerospace.agent.sac import SAC
agent = SAC.from_pretrained(
    "./example/reinforcement_learning/best_episode_200k_episodes_0008_mae/Oct02_11-52-57_SAC/",
    load_gradients=True,  # restore optimizer states to continue training
)
agent.train(num_episodes=10)
agent.save("./runs", save_gradients=True)
```
## Training Details
The saved `config.json` contains the exact environment and policy parameters used for training. Key entries:
- `env.name`: `tensoraerospace.envs.b747.ImprovedB747Env`
- `env.params`:
- `initial_state`: `[0, 0, 0, 0]`
- `reference_signal`: shape `(1, 201)` sinusoidal-like target for pitch
- `number_time_steps`: `201`
- `policy.params`:
- `gamma`: `0.99`
- `tau`: `0.02`
- `alpha`: `auto` via automatic entropy tuning
- `batch_size`: `256`
- `updates_per_step`: `2`
- `target_update_interval`: `1`
- `lr`: `3e-4`
- `policy_type`: `Gaussian`
- `device`: `cpu`
Note: With `automatic_entropy_tuning=True`, `log_alpha` and `alpha_optim` state are saved and can be restored.
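To make the `env.params` entries above concrete, here is a minimal sketch of building a reference signal with the recorded shape `(1, 201)`. The amplitude, frequency, and time span are illustrative assumptions; only the shape and the "sinusoidal-like" description come from the config.

```python
import numpy as np

number_time_steps = 201  # from config.json: env.params.number_time_steps
t = np.linspace(0.0, 10.0, number_time_steps)  # time grid; duration is an assumption

# Sinusoidal-like pitch target; amplitude and frequency are illustrative only.
reference_signal = (0.1 * np.sin(2 * np.pi * 0.1 * t)).reshape(1, -1)

print(reference_signal.shape)  # (1, 201), matching the recorded config
```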
## Evaluation
The agent was validated in simulation on the same environment by tracking the provided reference pitch signal over `201` steps. The reward is the negative of quadratic costs on tracking error, pitch rate, control magnitude, control smoothness, and control jerk.
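The reward structure described above can be sketched as a weighted sum of quadratic penalties. The weights and the exact difference-based definitions of smoothness and jerk are assumptions for illustration; the environment's actual reward lives in `ImprovedB747Env`.

```python
def pitch_reward(pitch_err, q, u, u_prev, u_prev2,
                 w=(1.0, 0.1, 0.01, 0.01, 0.001)):
    """Negative quadratic cost on tracking error, pitch rate, control
    magnitude, smoothness (first difference of control), and jerk
    (second difference). Weights `w` are illustrative assumptions."""
    smooth = u - u_prev                # penalizes abrupt control changes
    jerk = u - 2.0 * u_prev + u_prev2  # penalizes oscillatory control
    cost = (w[0] * pitch_err**2 + w[1] * q**2 + w[2] * u**2
            + w[3] * smooth**2 + w[4] * jerk**2)
    return -cost

print(pitch_reward(0.0, 0.0, 0.0, 0.0, 0.0))  # 0.0 at perfect tracking, zero control
```

Because every term is quadratic, the reward is maximized (at zero) only when the pitch error, pitch rate, and control activity all vanish, which is what drives the agent toward smooth, accurate tracking.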
## Bias, Risks, and Limitations
- Simulation fidelity limits real-world applicability.
- Trained on a specific reference and time horizon; generalization requires retraining.
- Safety constraints are implicit via reward shaping and bounds; not certified for real flight.
## Environmental Impact
Training for this checkpoint was performed on CPU. For large-scale training, estimate CO2eq emissions with the [ML CO2 Impact](https://mlco2.github.io/impact#compute) calculator.
## Technical Specs
- **Algorithm:** Soft Actor-Critic
- **Networks:** MLP policy and twin Q-networks (hidden size: 256 by default)
- **Frameworks:** PyTorch, Gymnasium
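As a rough illustration of the twin Q-network idea named above, here is a dependency-light NumPy sketch: two independent MLP critics score a state-action pair, and SAC uses the minimum of the two to curb value overestimation. Layer count, initialization, and activation are assumptions; only the 4-dim observation, 1-dim action, and hidden size 256 come from this card.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM, HIDDEN = 4, 1, 256  # observation/action dims and hidden size from this card

def make_q_params():
    """One Q-network: a single-hidden-layer MLP mapping (obs, action) -> scalar."""
    return {
        "W1": rng.normal(0.0, 0.1, (OBS_DIM + ACT_DIM, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, (HIDDEN, 1)),
        "b2": np.zeros(1),
    }

def q_value(p, obs, act):
    x = np.concatenate([obs, act])
    h = np.maximum(0.0, x @ p["W1"] + p["b1"])  # ReLU hidden layer
    return float(h @ p["W2"] + p["b2"])

# Twin critics: taking the minimum reduces overestimation bias in the target.
q1, q2 = make_q_params(), make_q_params()
obs, act = np.zeros(OBS_DIM), np.zeros(ACT_DIM)
q_min = min(q_value(q1, obs, act), q_value(q2, obs, act))
```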
## Citation
If you use this model, please cite the TensorAeroSpace repository.
```bibtex
@misc{tensoraerospace,
  title        = {TensorAeroSpace: Aerospace Simulation and RL Framework},
  author       = {TensorAeroSpace contributors},
  year         = {2023},
  howpublished = {\url{https://github.com/tensoraerospace/tensoraerospace}},
}
```
## Model Card Authors
TensorAeroSpace Team
## Contact
For questions, please open an issue at the repository or email support@tensoraerospace.org.