# Training Report - Ensemble

Generated: 2025-09-08 18:12:05

## Overview

- **Command**: `ensemble`
- **Training Duration**: 6144.87 seconds (102.4 minutes)
- **Output Directory**: `output/ensemble_20250908_162940`

## Dataset Information

- **Total Records**: 25,512
- **Training Steps per Epoch**: 637
- **Validation Steps per Epoch**: 159

### Vocabulary Sizes

- **Stations**: 6 unique stations
- **Routes**: 13 unique routes
- **Tracks**: 13 unique tracks (prediction targets; integer encoding sketched below)
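These vocabularies are what the embedding layers (see Model Architecture) index into. A minimal sketch of the encoding, with hypothetical station names since the report records only vocabulary sizes:

```python
# Hypothetical station names; the report records only the vocabulary sizes.
stations = ["sta_0", "sta_1", "sta_2", "sta_3", "sta_4", "sta_5"]  # 6 unique stations

# Map each categorical value to an integer id for the embedding layers.
station_to_id = {name: i for i, name in enumerate(stations)}
num_stations = len(station_to_id)  # embedding input_dim
```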
## Training Configuration

- **Num Models**: 6
- **Epochs**: 1000
- **Batch Size**: 32
- **Base Learning Rate**: 0.001
- **Dataset Size**: 25,512
- **Bagging Fraction**: 1.0
- **Seed Base**: 42 (per-model seeding and bagging sketched below)
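How the bagging fraction and seed base combine per model is not spelled out in this report. A plausible sketch, assuming each member gets seed `seed_base + model_index` and a bootstrap sample of `bagging_fraction * dataset_size` records:

```python
import numpy as np

dataset_size = 25_512
bagging_fraction = 1.0
seed_base = 42
num_models = 6

for model_index in range(num_models):
    rng = np.random.default_rng(seed_base + model_index)  # assumed seeding scheme
    sample_size = int(bagging_fraction * dataset_size)
    # Bootstrap sample (with replacement); at fraction 1.0 it is full-size.
    indices = rng.choice(dataset_size, size=sample_size, replace=True)
```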
## Final Performance Metrics

- **Average Validation Loss**: 0.9233
- **Average Validation Accuracy**: 0.7460
- **Best Individual Accuracy**: 0.7720
- **Worst Individual Accuracy**: 0.7256
- **Ensemble Accuracy Std**: 0.0181 (see the check below)
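As a quick consistency check, these summary statistics follow from the per-model accuracies listed under Additional Information:

```python
import numpy as np

# Per-model validation accuracies from the Individual Model Metrics table.
acc = np.array([0.7720, 0.7683, 0.7400, 0.7256, 0.7421, 0.7280])

print(f"{acc.mean():.4f}")  # 0.7460  average validation accuracy
print(f"{acc.std():.4f}")   # 0.0181  population std, matching the report
print(f"{acc.max():.4f}")   # 0.7720  best individual accuracy
print(f"{acc.min():.4f}")   # 0.7256  worst individual accuracy
```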
## Additional Information

- **Ensemble Strategy**: Diverse architectures (deep, wide, standard)
- **Learning Rate Variation**: 0.8x to 1.2x base rate with random variation (sketched below)
- **Total Parameters**: 269,360

### Individual Model Metrics

| Model | Validation Loss | Validation Accuracy | Learning Rate | Parameters |
|------:|----------------:|--------------------:|--------------:|-----------:|
| 0 | 0.8819 | 0.7720 | 0.001168 | 53,384 |
| 1 | 0.9238 | 0.7683 | 0.001044 | 156,552 |
| 2 | 0.9241 | 0.7400 | 0.000965 | 14,856 |
| 3 | 0.9482 | 0.7256 | 0.000916 | 14,856 |
| 4 | 0.9161 | 0.7421 | 0.000820 | 14,856 |
| 5 | 0.9454 | 0.7280 | 0.000967 | 14,856 |
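The per-model learning rates above are all consistent with the stated 0.8x-1.2x range. How the variation is drawn is not specified; a sketch assuming uniform sampling:

```python
import numpy as np

base_lr = 0.001
rng = np.random.default_rng(42)  # seed choice is illustrative

# One learning rate per ensemble member, uniformly in [0.8, 1.2] x base.
model_lrs = base_lr * rng.uniform(0.8, 1.2, size=6)
```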
### Temperature Scaling

- **Temperature**: 1.5000 (applied as sketched below)
- **Uncalibrated NLL**: 1.9108
- **Calibrated NLL**: 1.8276
- **Uncalibrated ECE**: 0.0939
- **Calibrated ECE**: 0.0302
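At inference time, temperature scaling divides the pre-softmax logits by T before normalizing; T = 1.5 > 1 softens overconfident predictions, which is what brought ECE down from 0.0939 to 0.0302. A minimal sketch (how logits are extracted from the saved softmax model is pipeline-specific and not shown here):

```python
import numpy as np

def calibrated_probs(logits: np.ndarray, temperature: float = 1.5) -> np.ndarray:
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```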
## Dataset Schema

The model was trained on MBTA track assignment data with the following features:

- **Categorical Features**: station_id, route_id, direction_id
- **Temporal Features**: hour, minute, day_of_week (cyclically encoded, as sketched below)
- **Target**: track_number (classification with 13 classes)
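Cyclical encoding maps each periodic feature onto the unit circle, so that wrap-around neighbors (e.g., 23:00 and 00:00) stay close in feature space:

```python
import numpy as np

def encode_cyclic(value: float, period: float) -> tuple[float, float]:
    """Return the (sin, cos) pair for a value on a cycle of the given period."""
    angle = 2 * np.pi * value / period
    return np.sin(angle), np.cos(angle)

hour_sin, hour_cos = encode_cyclic(18, 24)  # hour of day
dow_sin, dow_cos = encode_cyclic(0, 7)      # day of week (assuming 0-indexed)
```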
## Model Architecture

- Embedding layers for categorical features
- Cyclical time encoding (sin/cos) for temporal patterns
- Dense layers with dropout regularization
- Softmax output for multi-class track prediction (an illustrative reconstruction follows)
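A sketch of how these pieces might fit together in Keras; embedding dimensions, hidden widths, and dropout rate are assumptions, not the trained models' values (which differ across the deep/wide/standard members):

```python
import keras
from keras import layers

station = keras.Input(shape=(1,), name="station_id")      # vocab size 6
route = keras.Input(shape=(1,), name="route_id")          # vocab size 13
direction = keras.Input(shape=(1,), name="direction_id")  # 0 or 1
time_feats = keras.Input(shape=(6,), name="time_cyclic")  # sin/cos of hour, minute, day_of_week

embedded = layers.Concatenate()([
    layers.Flatten()(layers.Embedding(6, 4)(station)),
    layers.Flatten()(layers.Embedding(13, 4)(route)),
    layers.Flatten()(layers.Embedding(2, 2)(direction)),
])
x = layers.Concatenate()([embedded, time_feats])
x = layers.Dense(64, activation="relu")(x)
x = layers.Dropout(0.3)(x)
x = layers.Dense(64, activation="relu")(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(13, activation="softmax", name="track")(x)  # 13 track classes

model = keras.Model([station, route, direction, time_feats], outputs)
```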
## Usage

To load and use this model:

```python
import keras

# Load for inference (optimizer not saved):
model = keras.models.load_model('track_prediction_ensemble_final.keras', compile=False)
```
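Prediction then follows the usual Keras pattern. `batch` below stands for a preprocessed input batch matching the Dataset Schema; its exact structure depends on the pipeline and is not documented in this report:

```python
import numpy as np

# `batch` is a placeholder for model-ready inputs (encoded categoricals plus
# cyclical time features); see the Dataset Schema section.
probs = model.predict(batch)                 # softmax over the 13 track classes
predicted_track = np.argmax(probs, axis=-1)  # integer track index per example
```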
---

*Report generated by imt-ml training pipeline*