---
language: es
tags:
- PyTorch
- EfficientNetV2
- regression
- body-fat-estimation
- DEXA
- k-fold
- optuna
- hyperparameter-optimization
base_model: torchvision/efficientnet_v2_s
---

# EfficientNetV2-S + Optuna — Body Fat Estimator

## Model description

A model based on **EfficientNetV2-S** (pretrained on ImageNet) with the classification head
replaced by an MLP regression head that estimates whole-body fat percentage (WBFP)
from a frontal photograph of the torso.
|
| | Los hiperparámetros (lr, weight_decay, dropout) se optimizaron con **Optuna** usando 3-Fold CV |
| | y 25 trials. La evaluación final se realizó con **10-Fold Cross-Validation**. |
| | |
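
The 10-fold protocol can be sketched with scikit-learn's `KFold`. This is a minimal illustration only; the actual splitting code and random seed used for this model are not stated in the card:

```python
from sklearn.model_selection import KFold

# Hypothetical sample indices standing in for the real dataset.
indices = list(range(100))

# 10 folds: each sample lands in the validation split exactly once.
kf = KFold(n_splits=10, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(indices)):
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val")
```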
---

## Results (10-Fold CV)

| Metric | Mean ± Std |
|--------|------------|
| SEE | 3.866 ± 0.724 |
| Pearson | 0.842 ± 0.073 |
| MAE | 3.003 ± 0.375 |

SEE and MAE are expressed in percentage points of body fat; Pearson is the correlation between predicted and reference WBFP.

Best fold: 8 (SEE: 2.8911)

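
For reference, the reported metrics can be computed from per-fold predictions as below. This is an illustrative sketch: the card does not state which SEE convention was used, so here SEE is taken as the standard error of estimate, `sqrt(SSE / (n - 2))`; some authors use plain RMSE instead.

```python
import math

def see(y_true, y_pred):
    """Standard error of estimate: sqrt(SSE / (n - 2))."""
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sse / (len(y_true) - 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson(y_true, y_pred):
    """Pearson correlation coefficient."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```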
---

## Best hyperparameters (Optuna)

| Parameter | Value |
|-----------|-------|
| Learning Rate | 0.000165 |
| Weight Decay | 0.000881 |
| Dropout | 0.123 |

---

## Training procedure

| Parameter | Value |
|-----------|-------|
| Base model | EfficientNetV2-S (torchvision) |
| Dataset | MasterMIARFID/peoplebf_dexa |
| Optimizer | AdamW |
| Loss | SmoothL1Loss |
| Scheduler | CosineAnnealingLR |
| Tuning | Optuna (25 trials, 3-Fold CV) |
| Final K-Fold | 10 |
| Epochs | 50 |
| Patience | 10 |
| Batch Size | 8 |
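
The table above translates into a training loop of roughly this shape. This is a hedged sketch, not the original training script: the data loading and the early-stopping bookkeeping are illustrative assumptions, while the optimizer, loss, scheduler, epochs, and patience follow the card.

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, lr=1.65e-4, weight_decay=8.81e-4,
          epochs=50, patience=10, device="cpu"):
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    criterion = nn.SmoothL1Loss()
    best_val, bad_epochs = float("inf"), 0

    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x).squeeze(-1), y)
            loss.backward()
            optimizer.step()
        scheduler.step()

        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x.to(device)).squeeze(-1), y.to(device)).item()
                           for x, y in val_loader) / len(val_loader)

        # Early stopping on validation loss with the card's patience of 10.
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_val
```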

---

## Image transformations

### Training
```
Resize(224, 224)
RandomHorizontalFlip(p=0.5)
RandomRotation(10)
RandomAffine(degrees=0, translate=(0.05, 0.05))
ColorJitter(brightness=0.1, contrast=0.1)
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```

### Validation/Inference
```
Resize(224, 224)
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```

---

## How to load and use the model

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

def create_efficientnet_regressor(dropout=0.123):
    # Backbone without pretrained weights; they are loaded from the checkpoint below.
    model = models.efficientnet_v2_s(weights=None)
    in_features = model.classifier[1].in_features
    # Replace the ImageNet classifier with the MLP regression head.
    model.classifier = nn.Sequential(
        nn.Dropout(p=dropout),
        nn.Linear(in_features, 256),
        nn.ReLU(),
        nn.Dropout(p=dropout / 2),
        nn.Linear(256, 1)
    )
    return model

# Inference-time preprocessing (must match the validation transforms above).
transform = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = create_efficientnet_regressor()
model.load_state_dict(torch.load('efficientnet_optuna_bodyfat_regressor.pth', weights_only=True))
model.eval()

img = Image.open('foto_torso_frente.jpg').convert('RGB')
x = transform(img).unsqueeze(0)  # add batch dimension: (1, 3, 224, 224)

with torch.no_grad():
    wbfp = model(x).item()

print(f'Estimated WBFP: {wbfp:.2f} %')
```

---

## Expected input images

The model expects **frontal torso images** (front view, body centered,
without clothing or with tight-fitting clothing). Images are standard RGB.
|