---
datasets:
- inaturalist2019
language: en
tags:
- image-classification
- pytorch
- efficientnet
- mixture-of-experts
- deepmoe
---

# DeepMoE EfficientNet-B0 fine-tuned on iNaturalist 2019

This model is a Mixture-of-Experts (DeepMoE) variant of EfficientNet-B0, fine-tuned on the iNaturalist 2019 dataset to optimize both accuracy and computational efficiency (FLOP reduction).

## Training Results

- **Final Score (Acc/FLOPs composite)**: 83.4129
- **Final Validation Accuracy**: 69.1%
- **Expert Activation Ratio**: 32.0%
- **FLOPs Usage**: 57.2% *(compared to baseline B0)*
- **Baseline B0 Reference FLOPs**: 388,184,000
- **Total Runtime**: 5,204.99 seconds

## Hyperparameters

- **Batch Size**: 256
- **Gradient Accumulation Steps**: 4
- **Weight Decay**: 0.005

### Epochs

- **Total Epochs**: 10
  - Joint Training Epochs: 10
  - Routing-Frozen Finetuning Epochs: 0

### DeepMoE Architecture & Routing

- **MoE Start Stage**: 1
- **Latent Dimension**: 32
- **Sparsity Penalty ($\lambda_g$)**: 7e-05
- **Target Sparsity ($\mu$)**: 0.5
- **ReLU Init (Val / Std)**: 1 / 1

### Learning Rates

- **MoE Routing Parameters**: 1.20e-01
- **Classification Head**: 2.00e-02
- **Base Model (Body)**: 2.00e-03
- **Finetune Phase (Frozen Routing)**: 0.00e+00

*Training was tracked using [Weights & Biases](https://wandb.ai).*
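
## Illustrative sketches

The snippets below are minimal PyTorch sketches of the mechanisms the numbers above refer to. They reflect a typical DeepMoE setup and the hyperparameters listed on this card, not the exact training code behind this checkpoint; all module and variable names are hypothetical.

In a DeepMoE-style model, a shallow embedding of the input image is mapped to one non-negative gate per convolutional channel (expert) at each stage; channels whose gate is exactly zero are skipped, which is what produces the FLOP savings reported above. The gate head below assumes the 32-dimensional latent embedding from the card and one plausible reading of **ReLU Init (Val / Std): 1 / 1**: initialize the pre-ReLU gates around 1 so every expert starts active.

```python
import torch
import torch.nn as nn

class DeepMoEGate(nn.Module):
    """Gate head for one conv stage: shallow embedding -> per-channel gates."""

    def __init__(self, latent_dim: int = 32, num_channels: int = 64,
                 init_val: float = 1.0, init_std: float = 1.0):
        super().__init__()
        self.fc = nn.Linear(latent_dim, num_channels)
        # Assumed reading of "ReLU Init (Val / Std)": center the initial
        # gates at `init_val`, with spread `init_std` scaled down by the
        # fan-in so pre-ReLU outputs stay near `init_val` at the start.
        nn.init.normal_(self.fc.weight, mean=0.0, std=init_std / latent_dim)
        nn.init.constant_(self.fc.bias, init_val)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, latent_dim) embedding from a small stem network.
        # ReLU lets gates reach exactly zero, switching experts off.
        return torch.relu(self.fc(z))

# The gates rescale the stage's output channels:
#   y = conv(x) * gates[:, :, None, None]
```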
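
The **Sparsity Penalty** and **Target Sparsity** rows describe the routing regularizer. A common DeepMoE-style choice, sketched below under that assumption, is an L1 penalty on the gate values scaled by $\lambda_g$; the monitor underneath computes the quantity comparable to the 32.0% **Expert Activation Ratio** reported above, with $\mu = 0.5$ as the ratio the penalty strength is tuned against.

```python
import torch

def routing_sparsity_loss(stage_gates: list[torch.Tensor],
                          lambda_g: float = 7e-5) -> torch.Tensor:
    # L1 on the (non-negative) gates pushes many of them to exactly zero,
    # i.e. toward sparse expert usage. An assumed formulation, not
    # necessarily the exact objective behind this checkpoint.
    return lambda_g * torch.stack([g.abs().mean() for g in stage_gates]).mean()

@torch.no_grad()
def expert_activation_ratio(stage_gates: list[torch.Tensor]) -> float:
    # Fraction of gates strictly above zero, averaged over stages --
    # comparable to the reported "Expert Activation Ratio" (32.0% here,
    # against a target sparsity of mu = 0.5).
    return torch.stack(
        [(g > 0).float().mean() for g in stage_gates]
    ).mean().item()
```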
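
The **FLOPs Usage** figure is the dynamic compute of the gated network relative to the dense B0 baseline (388,184,000 FLOPs). A first-order accounting, under the assumption that each stage's cost scales with the fraction of its expert channels that fire:

```python
BASELINE_B0_FLOPS = 388_184_000  # "Baseline B0 Reference FLOPs" from the card

def flops_usage(stage_flops: list[float], stage_active: list[float]) -> float:
    """Relative dynamic FLOPs: each stage's dense cost scaled by that
    stage's expert activation ratio, summed and divided by the dense
    baseline. An approximation -- the exact accounting behind the 57.2%
    figure is not specified on this card."""
    used = sum(f * r for f, r in zip(stage_flops, stage_active))
    return used / BASELINE_B0_FLOPS
```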
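
The **Learning Rates** section implies three optimizer parameter groups (routing, head, body), combined with the card's weight decay. A minimal sketch, assuming SGD (the optimizer family is not stated on the card) and placeholder modules standing in for the real model's parts. The 0.00e+00 finetune-phase rate would apply to the routing group during a routing-frozen phase, but this run used 0 such epochs.

```python
import torch
import torch.nn as nn

# Hypothetical module layout standing in for the real model's parts.
model = nn.ModuleDict({
    "backbone":   nn.Conv2d(3, 64, kernel_size=3),  # EfficientNet-B0 body (placeholder)
    "gates":      nn.Linear(32, 64),                # MoE routing parameters (placeholder)
    "classifier": nn.Linear(64, 1010),              # iNaturalist 2019 has 1,010 classes
})

optimizer = torch.optim.SGD(
    [
        {"params": model["gates"].parameters(),      "lr": 1.20e-01},  # MoE routing
        {"params": model["classifier"].parameters(), "lr": 2.00e-02},  # classification head
        {"params": model["backbone"].parameters(),   "lr": 2.00e-03},  # base model (body)
    ],
    momentum=0.9,        # assumption; not listed on the card
    weight_decay=0.005,  # "Weight Decay" from the Hyperparameters section
)
```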
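
With 4 gradient accumulation steps, the optimizer updates once per 4 forward/backward passes. The loop below sketches that mechanism, reusing the pieces above; `model_fn` (returning `(logits, stage_gates)`) and the tiny synthetic `loader` are stand-ins for the real forward pass and the iNaturalist 2019 DataLoader with batch size 256.

```python
import torch
import torch.nn.functional as F

ACCUM_STEPS = 4  # "Gradient Accumulation Steps" from the card

def model_fn(images):
    # Hypothetical forward pass: returns class logits and the list of
    # per-stage gate tensors consumed by routing_sparsity_loss above.
    z = torch.randn(images.size(0), 32)   # stand-in shallow embedding
    gates = torch.relu(model["gates"](z))
    logits = model["classifier"](gates)
    return logits, [gates]

# Synthetic stand-in batches; in practice this is the iNat-2019 DataLoader.
loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, 1010, (8,)))
          for _ in range(8)]

optimizer.zero_grad()
for step, (images, labels) in enumerate(loader):
    logits, stage_gates = model_fn(images)
    loss = F.cross_entropy(logits, labels) + routing_sparsity_loss(stage_gates)
    (loss / ACCUM_STEPS).backward()       # average over the accumulated steps
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()                  # one update per 4 micro-batches
        optimizer.zero_grad()
```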