---
language: en
license: mit
tags: ["image-regression", "tensorflow", "mobilenetv2", "utkface", "age-estimation"]
datasets: ["UTKFace"]
metrics: ["mean_absolute_error"]
---

# UTKFace Age Regression — Model Card

This repository contains code to train a TensorFlow / Keras regression model that estimates a person's age from a face image using the UTKFace dataset. The model uses a MobileNetV2 backbone and a small regression head on top.

## Summary

- **Model type**: Image regression (single-output continuous)
- **Backbone**: MobileNetV2 (ImageNet pre-trained)
- **Task**: Age estimation (years)
- **Dataset**: UTKFace (public dataset; filenames encode age)
- **Reported metric**: Mean Absolute Error (MAE) — see Evaluation section for how to compute and report MAE for your runs

## Model details

- **Input**: RGB face image (recommended size: 224×224)
- **Output**: Single scalar value — predicted age in years
- **Preprocessing**: MobileNetV2 preprocessing (scales inputs to [-1, 1])
- **Loss**: Mean Squared Error (MSE) used during training
- **Metric for reporting**: Mean Absolute Error (MAE)
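
The MobileNetV2 preprocessing listed above amounts to a linear rescale of pixel values from [0, 255] to [-1, 1]. The following is a minimal NumPy sketch of that arithmetic for illustration; `scale_to_unit_range` is a hypothetical helper name, and in practice `tensorflow.keras.applications.mobilenet_v2.preprocess_input` should be used.

```python
import numpy as np


def scale_to_unit_range(pixels: np.ndarray) -> np.ndarray:
    """Map pixel values in [0, 255] to [-1, 1] (hypothetical helper)."""
    # Dividing by 127.5 maps [0, 255] to [0, 2]; subtracting 1 centers it on 0.
    return pixels.astype(np.float32) / 127.5 - 1.0


demo = np.array([0, 255], dtype=np.uint8)
scaled = scale_to_unit_range(demo)
print(scaled.min(), scaled.max())
```

Feeding raw [0, 255] pixels to a model trained on [-1, 1] inputs will likely produce meaningless ages, so the same scaling has to be applied at training and inference time.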

## Intended uses

- Research and educational purposes for learning about image regression and age estimation
- Prototyping demo applications that predict approximate age ranges from face crops

## Out-of-scope / Limitations

- This model provides an estimate of age; it's not a substitute for official identification
- Models trained on UTKFace carry dataset biases (race, gender, age distribution). They may underperform on underrepresented groups.
- Do not use this model for high-stakes decision making (employment, legal, medical, etc.)

## Dataset

**UTKFace**

- **Source**: https://susanqq.github.io/UTKFace/
- **Format**: Filenames encode metadata as `<age>_<gender>_<race>_<date&time>.jpg`.
- **Usage**: The training scripts in this repo extract the age from the filename (the integer before the first underscore).
- **Note**: Respect the dataset's license and authors when redistributing or publishing results.
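
The age extraction described above (the integer before the first underscore) could be sketched as follows. This is an illustrative helper, not the code in `train.py`; the function name `age_from_filename` and the sample filenames are assumptions, and returning `None` for names that do not match is one possible way to handle unexpected files.

```python
from pathlib import Path
from typing import Optional


def age_from_filename(path: str) -> Optional[int]:
    """Return the integer before the first underscore, or None if absent."""
    prefix, sep, _ = Path(path).name.partition("_")
    # Reject names with no underscore or a non-numeric prefix.
    if not sep or not (prefix.isascii() and prefix.isdigit()):
        return None
    return int(prefix)


# Hypothetical filenames following <age>_<gender>_<race>_<date&time>.jpg
print(age_from_filename("data/UTKFace/26_0_2_20170104023102422.jpg"))
print(age_from_filename("notes.txt"))
```

Skipping files that return `None` keeps stray or malformed filenames from crashing the data pipeline.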

## Training details

- **Framework**: TensorFlow / Keras
- **Backbone**: MobileNetV2 pretrained on ImageNet
- **Head**: GlobalAveragePooling2D -> Dense(128, relu) -> Dense(1, linear)
- **Recommended input size**: 224×224 (configurable via command-line args in `train.py`)
- **Batch size**: configurable (default set in `train.py`)
- **Optimizer**: Adam (default), learning rate and scheduler configurable in `train.py`
- **Loss**: Mean Squared Error (MSE)
- **Metric**: Mean Absolute Error (MAE) reported on validation/test sets
- **Augmentations**: Basic augmentations recommended (flip, random crop/brightness) for better robustness
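
As a rough illustration of the flip and brightness augmentations mentioned above, here is a NumPy sketch that operates on one image already scaled to [-1, 1]. The function name `augment`, the flip probability, and the `max_brightness_delta` default are assumptions chosen for the example; the training script may implement augmentation differently (for instance inside its tf.data pipeline).

```python
import numpy as np


def augment(image: np.ndarray, rng: np.random.Generator,
            max_brightness_delta: float = 0.1) -> np.ndarray:
    """Randomly flip and brightness-shift one HxWxC image in [-1, 1]."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal flip: reverse the width axis
    # Add one random offset to every pixel, then clip back into range.
    delta = rng.uniform(-max_brightness_delta, max_brightness_delta)
    return np.clip(out + delta, -1.0, 1.0).astype(np.float32)


rng = np.random.default_rng(seed=0)
image = rng.uniform(-1.0, 1.0, size=(224, 224, 3)).astype(np.float32)
augmented = augment(image, rng)
print(augmented.shape, augmented.dtype)
```

Neither transform changes the age label, so the target value can be passed through unchanged.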

## Reproducibility / Example training command

1. **Prepare UTKFace dataset**
   - Download and extract UTKFace images into `data/UTKFace/` or pass `--dataset_dir` to the training script.
2. **Install dependencies**
   - `python -m pip install -r requirements.txt`
3. **Train**
   - `python train.py --dataset_dir data/UTKFace --epochs 30 --batch_size 32 --img_size 224 --output_dir saved_model`

The `train.py` script builds a tf.data pipeline, extracts ages from filenames, constructs a MobileNetV2-based model, and saves the trained model to the directory given by `--output_dir`.

## Evaluation and metrics (MAE)

Mean Absolute Error (MAE) gives an intuitive measure of average error in predicted age (in years):

```
MAE = mean(|y_true - y_pred|)
```

Compute MAE in Python (example):

```python
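import numpy as np
y_true = np.array([25.0, 40.0, 63.0])  # example ground-truth ages
y_pred = np.array([28.0, 37.5, 60.0])  # example predicted ages
mae = np.mean(np.abs(y_true - y_pred))  # mean of 3.0, 2.5 and 3.0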
```

The training script prints per-epoch validation MAE. To compute test MAE after training, run the provided evaluation routine or use a snippet like this:

```python
from tensorflow import keras
import numpy as np
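# Note: under Keras 3 (TensorFlow 2.16+), load_model() accepts only .keras/.h5 files;
# a SavedModel directory needs TensorFlow 2.15 or earlier (or keras.layers.TFSMLayer).
model = keras.models.load_model('saved_model')
# test_images: float array of shape (N, 224, 224, 3), already passed through
# MobileNetV2 preprocess_input; test_labels: float array of N ages in years.
preds = model.predict(test_images).squeeze(axis=-1)  # (N, 1) -> (N,)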
mae = float(np.mean(np.abs(test_labels - preds)))
print('Test MAE (years):', mae)
```

Note: Exact MAE depends on preprocessing, train/validation split, augmentations, and hyperparameters. Report MAE alongside the exact training configuration for reproducibility.
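
Because the split affects the reported MAE, it helps to make it deterministic. The following sketch shows one way to do that with a fixed seed; the function name `split_files`, the 10% fractions, and the seed value are assumptions for illustration and are not taken from `train.py`.

```python
import random


def split_files(filenames, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Deterministically split filenames into train/val/test lists."""
    files = sorted(filenames)  # sort first so directory listing order is irrelevant
    random.Random(seed).shuffle(files)  # seeded shuffle is repeatable
    n_val = int(len(files) * val_fraction)
    n_test = int(len(files) * test_fraction)
    return {
        "test": files[:n_test],
        "val": files[n_test:n_test + n_val],
        "train": files[n_test + n_val:],
    }


# Hypothetical filenames in the UTKFace naming pattern
names = [f"{age}_0_0_{i}.jpg" for i, age in enumerate(range(1, 101))]
splits = split_files(names)
print({name: len(items) for name, items in splits.items()})
```

Recording the seed and fractions alongside the MAE lets others rebuild the same test set.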

## Usage — Quick examples

**Python (local SavedModel)**

```python
import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

model = tf.keras.models.load_model('saved_model')  # path to a SavedModel directory
img = Image.open('path/to/face.jpg').convert('RGB').resize((224, 224))
arr = np.array(img, dtype=np.float32)
arr = preprocess_input(arr)
pred = model.predict(np.expand_dims(arr, 0))[0, 0]
print('Predicted age (years):', float(pred))
```

**Command-line (using predict.py)**

```
python predict.py --model_dir saved_model --image path/to/face.jpg
```

**Loading from Hugging Face Hub**

If you upload your saved model to the Hugging Face Hub, users can download it with the `huggingface_hub` package. For example, in a Space, set the environment variable `HF_MODEL_ID` to the model repository ID (e.g. `username/my-age-model`); the Gradio app supplied in this repo will then attempt to download and use that model.
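
One possible shape for that lookup is sketched below. It is an assumption about how such logic could be written, not a copy of `app.py`: the helper name `resolve_model_dir` and the fallback to a local `saved_model` directory are hypothetical, while `snapshot_download` is the `huggingface_hub` function that fetches a repository and returns its local path.

```python
import os


def resolve_model_dir(default_dir: str = "saved_model") -> str:
    """Return a local model path, downloading from the Hub if HF_MODEL_ID is set."""
    repo_id = os.environ.get("HF_MODEL_ID")
    if not repo_id:
        # No Hub repository configured: fall back to the local directory.
        return default_dir
    # Imported lazily so local-only use does not require huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id)
```

Calling `resolve_model_dir()` with `HF_MODEL_ID` unset simply returns `saved_model`, so the same code path works locally and in a Space.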

**Gradio demo / Hugging Face Space**

A simple Gradio app is provided in `app.py` that:

- accepts an input face image
- preprocesses it (resize to 224×224, then MobileNetV2 preprocessing)
- returns the predicted age (years) and the model's raw output

**How to host as a Space**

1. Create a new Space on Hugging Face and select "Gradio" as the SDK.
2. Push this repository to the Space (include `app.py`, your `saved_model/` directory or set `HF_MODEL_ID` to your model on the Hub).
3. Make sure `requirements.txt` includes `gradio` and `huggingface_hub`; if the project's `requirements.txt` does not list them, add them for the Space.

## Files in this repository

- `train.py` — training script
- `predict.py` — single-image prediction helper
- `convert_model.py` — conversion helpers
- `inference_log.py`, `inference_log.txt`, `load_predict_log.txt` — logging and CLI helpers for inference (dev)
- `app.py` — (added) Gradio demo app for live predictions
- `requirements.txt` — Python dependencies (extend for Spaces with `gradio` and `huggingface_hub`)

## Security, biases and ethical considerations

- Age estimation models can reflect and amplify biases in the training data (race and gender imbalance, age distribution). Evaluate fairness across demographic slices before using widely.
- Avoid using the model in high-risk contexts where inaccurate age estimates could cause harm.
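
A simple starting point for the per-slice evaluation suggested above is to compute MAE separately for each group label. The sketch below assumes the group labels come from the gender or race field of the UTKFace filename; the function name `mae_by_group` and all values are toy examples, not measured results.

```python
import numpy as np


def mae_by_group(y_true, y_pred, groups):
    """Return {group label: MAE in years} for each distinct group."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    groups = np.asarray(groups)
    abs_err = np.abs(y_true - y_pred)
    # Boolean mask selects the errors belonging to each group in turn.
    return {str(g): float(abs_err[groups == g].mean()) for g in np.unique(groups)}


# Toy values for illustration only
y_true = [20, 30, 40, 50]
y_pred = [22, 27, 45, 50]
gender = ["0", "1", "0", "1"]
print(mae_by_group(y_true, y_pred, gender))
```

Large gaps between groups are a signal to inspect the training distribution before relying on the model for those groups.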

## How to cite / license

- UTKFace authors and dataset should be cited if you publish results.
- This repository is provided under the MIT license (see LICENSE file if present).

## Contact and credits

**Maintainer**: Stealth Labs Ltd.

**Acknowledgements**

Thanks to the UTKFace dataset authors for the publicly available images used in training and experimentation.