# AdaFortiTran: Adaptive Transformer Model for Robust OFDM Channel Estimation

[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/)
[![PyTorch](https://img.shields.io/badge/PyTorch-1.8+-red.svg)](https://pytorch.org/)

Official implementation of [AdaFortiTran: An Adaptive Transformer Model for Robust OFDM Channel Estimation](https://arxiv.org/abs/2505.09076) accepted at ICC 2025, Montreal, Canada.

## 📖 Overview

AdaFortiTran is a novel adaptive transformer-based model for OFDM channel estimation that dynamically adapts to varying channel conditions (SNR, delay spread, Doppler shift). The model combines the power of transformer architectures with channel-aware adaptation mechanisms to achieve robust performance across diverse wireless environments.

### Key Features
- **🔄 Adaptive Architecture**: Dynamically adapts to channel conditions using meta-information
- **⚡ High Performance**: State-of-the-art results on OFDM channel estimation tasks
- **🧠 Transformer-Based**: Leverages attention mechanisms for long-range dependencies
- **🎯 Robust**: Maintains performance across varying SNR, delay spread, and Doppler conditions
- **🚀 Production Ready**: Comprehensive training pipeline with advanced features

## 🏗️ Architecture

The project implements three model variants:

1. **Linear Estimator**: Simple learned linear transformation baseline
2. **FortiTran**: Fixed transformer-based channel estimator
3. **AdaFortiTran**: Adaptive transformer with channel condition awareness

### Model Comparison

| Model | Channel Adaptation | Complexity | Performance |
|-------|-------------------|------------|-------------|
| Linear | ❌ | Low | Baseline |
| FortiTran | ❌ | Medium | Good |
| AdaFortiTran | ✅ | High | **Best** |

## 🚀 Quick Start

### Installation

1. **Clone the repository**:
   ```bash
   git clone https://github.com/your-username/AdaFortiTran.git
   cd AdaFortiTran
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Verify installation**:
   ```bash
   python -c "import torch; print(f'PyTorch {torch.__version__}')"
   ```

### Basic Training

Train an AdaFortiTran model with default settings:

```bash
python src/main.py \
    --model_name adafortitran \
    --system_config_path config/system_config.yaml \
    --model_config_path config/adafortitran.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id my_experiment
```

### Advanced Training

Use all available features for optimal performance:

```bash
python src/main.py \
    --model_name adafortitran \
    --system_config_path config/system_config.yaml \
    --model_config_path config/adafortitran.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id advanced_experiment \
    --batch_size 128 \
    --lr 5e-4 \
    --max_epoch 100 \
    --patience 10 \
    --weight_decay 1e-4 \
    --gradient_clip_val 1.0 \
    --use_mixed_precision \
    --save_every_n_epochs 5 \
    --num_workers 8 \
    --test_every_n 5
```

## 📁 Project Structure

```
AdaFortiTran/
├── config/                     # Configuration files
│   ├── system_config.yaml     # OFDM system parameters
│   ├── adafortitran.yaml      # AdaFortiTran model config
│   ├── fortitran.yaml         # FortiTran model config
│   └── linear.yaml            # Linear model config
├── data/                      # Dataset directory
│   ├── train/                 # Training data
│   ├── val/                   # Validation data
│   └── test/                  # Test data (DS, MDS, SNR sets)
├── src/                       # Source code
│   ├── main/                  # Training pipeline
│   │   ├── trainer.py         # Enhanced ModelTrainer
│   │   └── parser.py          # Command-line argument parser
│   ├── models/                # Model implementations
│   │   ├── adafortitran.py    # AdaFortiTran model
│   │   ├── fortitran.py       # FortiTran model
│   │   ├── linear.py          # Linear model
│   │   └── blocks/            # Model building blocks
│   ├── data/                  # Data loading
│   │   └── dataset.py         # Dataset and DataLoader classes
│   ├── config/                # Configuration management
│   │   ├── config_loader.py   # YAML configuration loader
│   │   └── schemas.py         # Pydantic validation schemas
│   └── utils.py               # Utility functions
├── requirements.txt           # Python dependencies
├── README.md                  # This file
```

## ⚙️ Configuration

### System Configuration (`config/system_config.yaml`)

Defines OFDM system parameters:

```yaml
ofdm:
  num_scs: 120      # Number of subcarriers
  num_symbols: 14   # Number of OFDM symbols

pilot:
  num_scs: 12       # Number of pilot subcarriers
  num_symbols: 2    # Number of pilot symbols
```

### Model Configuration (`config/adafortitran.yaml`)

Defines model architecture parameters:

```yaml
model_type: 'adafortitran'
patch_size: [3, 2]                    # Patch dimensions
num_layers: 6                         # Transformer layers
model_dim: 128                        # Model dimension
num_head: 4                           # Attention heads
activation: 'gelu'                    # Activation function
dropout: 0.1                          # Dropout rate
max_seq_len: 512                      # Maximum sequence length
pos_encoding_type: 'learnable'        # Positional encoding
channel_adaptivity_hidden_sizes: [7, 42, 560]  # Adaptation layers
adaptive_token_length: 6              # Adaptive token length
```

## 🎯 Training Features

### Advanced Training Options

| Feature | Description | Default |
|---------|-------------|---------|
| `--use_mixed_precision` | Enable mixed precision training | False |
| `--gradient_clip_val` | Gradient clipping value | None |
| `--weight_decay` | Weight decay for optimizer | 0.0 |
| `--save_checkpoints` | Enable model checkpointing | True |
| `--save_best_only` | Save only best model | True |
| `--resume_from_checkpoint` | Resume from checkpoint | None |
| `--num_workers` | Data loading workers | 4 |
| `--pin_memory` | Pin memory for GPU | True |

### Callback System

The training pipeline includes an extensible callback system:

- **TensorBoard Logging**: Automatic metric tracking and visualization
- **Checkpoint Management**: Flexible checkpoint saving strategies
- **Custom Callbacks**: Easy to add new logging or monitoring systems

### Performance Optimizations

- **Mixed Precision Training**: Faster training on modern GPUs
- **Optimized Data Loading**: Configurable workers and memory pinning
- **Gradient Clipping**: Stable training with configurable clipping
- **Early Stopping**: Automatic training termination on plateau

## 📊 Dataset Format

### Expected File Structure

```
data/
├── train/
│   ├── 1_SNR-20_DS-50_DOP-500_N-3_TDL-A.mat
│   ├── 2_SNR-20_DS-50_DOP-500_N-3_TDL-A.mat
│   └── ...
├── val/
│   └── ...
└── test/
    ├── DS_test_set/          # Delay Spread tests
    │   ├── DS_50/
    │   ├── DS_100/
    │   └── ...
    ├── SNR_test_set/         # SNR tests
    │   ├── SNR_10/
    │   ├── SNR_20/
    │   └── ...
    └── MDS_test_set/         # Multi-Doppler tests
        ├── DOP_200/
        ├── DOP_400/
        └── ...
```

### File Naming Convention

Files must follow the pattern:
```
{file_number}_SNR-{snr}_DS-{delay_spread}_DOP-{doppler}_N-{pilot_freq}_{channel_type}.mat
```

Example: `1_SNR-20_DS-50_DOP-500_N-3_TDL-A.mat`

### Data Format

Each `.mat` file must contain variable `H` with shape `[subcarriers, symbols, 3]`:
- `H[:, :, 0]`: Ground truth channel (complex values)
- `H[:, :, 1]`: LS channel estimate with zeros for non-pilot positions
- `H[:, :, 2]`: Reserved for future use

## 🔧 Usage Examples

### Training Different Models

**Linear Estimator**:
```bash
python src/main.py \
    --model_name linear \
    --system_config_path config/system_config.yaml \
    --model_config_path config/linear.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id linear_baseline
```

**FortiTran**:
```bash
python src/main.py \
    --model_name fortitran \
    --system_config_path config/system_config.yaml \
    --model_config_path config/fortitran.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id fortitran_experiment
```

**AdaFortiTran**:
```bash
python src/main.py \
    --model_name adafortitran \
    --system_config_path config/system_config.yaml \
    --model_config_path config/adafortitran.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id adafortitran_experiment
```

### Resume Training

```bash
python src/main.py \
    --model_name adafortitran \
    --system_config_path config/system_config.yaml \
    --model_config_path config/adafortitran.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id resumed_experiment \
    --resume_from_checkpoint runs/adafortitran_experiment/best/checkpoint_epoch_50.pt
```

### Hyperparameter Tuning

```bash
python src/main.py \
    --model_name adafortitran \
    --system_config_path config/system_config.yaml \
    --model_config_path config/adafortitran.yaml \
    --train_set data/train \
    --val_set data/val \
    --test_set data/test \
    --exp_id hyperparameter_tuning \
    --batch_size 64 \
    --lr 1e-3 \
    --max_epoch 50 \
    --patience 5 \
    --weight_decay 1e-5 \
    --gradient_clip_val 0.5 \
    --use_mixed_precision \
    --test_every_n 5
```

## 📈 Monitoring and Logging

### TensorBoard Integration

Training automatically logs metrics to TensorBoard:

```bash
tensorboard --logdir runs/
```

Available metrics:
- Training/validation loss
- Learning rate
- Test performance across conditions
- Error visualizations
- Model hyperparameters

### Log Files

Training logs are saved to:
- `logs/training_{exp_id}.log`: Python logging output
- `runs/{model_name}_{exp_id}/`: TensorBoard logs and checkpoints

## 🧪 Testing and Evaluation

### Automatic Testing

The training pipeline automatically evaluates models on:
- **DS (Delay Spread)**: Varying delay spread conditions
- **SNR**: Different signal-to-noise ratios
- **MDS (Multi-Doppler)**: Various Doppler shift scenarios

### Manual Evaluation

```python
from src.models import AdaFortiTranEstimator
from src.config import load_config

# Load configurations
system_config, model_config = load_config(
    'config/system_config.yaml', 
    'config/adafortitran.yaml'
)

# Initialize model
model = AdaFortiTranEstimator(system_config, model_config)

# Load checkpoint
checkpoint = torch.load('checkpoint.pt')
model.load_state_dict(checkpoint['model_state_dict'])

# Evaluate
model.eval()
# ... evaluation code
```

## 🔬 Research and Development

### Adding Custom Callbacks

```python
from src.main.trainer import Callback, TrainingMetrics

class CustomCallback(Callback):
    def on_epoch_end(self, epoch: int, metrics: TrainingMetrics) -> None:
        # Custom logic here
        print(f"Epoch {epoch}: Train Loss = {metrics.train_loss:.4f}")
```

### Extending Models

The modular architecture makes it easy to add new model variants:

```python
from src.models.fortitran import BaseFortiTranEstimator

class CustomEstimator(BaseFortiTranEstimator):
    def __init__(self, system_config, model_config):
        super().__init__(system_config, model_config, use_channel_adaptation=True)
        # Add custom components
```

## 🐛 Troubleshooting

### Common Issues

**CUDA Out of Memory**:
- Reduce batch size: `--batch_size 32`
- Enable mixed precision: `--use_mixed_precision`
- Reduce number of workers: `--num_workers 2`

**Slow Training**:
- Increase number of workers: `--num_workers 8`
- Enable pin memory: `--pin_memory`
- Use mixed precision: `--use_mixed_precision`

**Poor Convergence**:
- Adjust learning rate: `--lr 1e-4`
- Add gradient clipping: `--gradient_clip_val 1.0`
- Increase patience: `--patience 10`

### Getting Help

1. Check the logs in `logs/training_{exp_id}.log`
2. Verify dataset format matches requirements
3. Ensure all dependencies are installed correctly
4. Check TensorBoard for training curves

## 📚 Citation

If you use this code in your research, please cite:

```bibtex
@misc{guler2025adafortitranadaptivetransformermodel,
      title={AdaFortiTran: An Adaptive Transformer Model for Robust OFDM Channel Estimation}, 
      author={Berkay Guler and Hamid Jafarkhani},
      year={2025},
      eprint={2505.09076},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.09076}, 
}
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

Copyright (c) 2025 [Berkay Guler/University of California, Irvine]