---
license: cc-by-nc-4.0
language:
- en
base_model:
- facebook/metaclip-2-worldwide-s16
pipeline_tag: image-classification
library_name: transformers
tags:
- text-generation-inference
- age-ange-estimator
---

![1.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/5EM2K6-rMXF-gJg7ZAPCD.png)

# **MetaCLIP-2-Age-Range-Estimator**

> **MetaCLIP-2-Age-Range-Estimator** is a vision-language encoder model fine-tuned from **[facebook/metaclip-2-worldwide-s16](https://huggingface.co/facebook/metaclip-2-worldwide-s16)** for single-label image classification.
> It is designed to predict the age range of a person from an image using the **MetaClip2ForImageClassification** architecture.

>[!note]
MetaCLIP 2: A Worldwide Scaling Recipe: https://huggingface.co/papers/2507.22062

```
Classification Report:
                  precision    recall  f1-score   support

      Child 0-12     0.9763    0.9758    0.9761      2193
  Teenager 13-20     0.9158    0.8437    0.8783      1779
     Adult 21-44     0.9593    0.9779    0.9685      9999
Middle Age 45-64     0.9458    0.9450    0.9454      3785
        Aged 65+     0.9769    0.9381    0.9571      1260

        accuracy                         0.9559     19016
       macro avg     0.9548    0.9361    0.9451     19016
    weighted avg     0.9557    0.9559    0.9556     19016
```
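
The averaged rows of the report can be recomputed from the per-class rows. The following is a minimal sketch that uses only the numbers printed above; the variable and function names are illustrative and are not taken from the model's evaluation code. The macro average is the unweighted mean over the five classes, while the weighted average weights each class by its support, so the weighted recall should coincide with the overall accuracy up to rounding.

```python
# Per-class (precision, recall, f1-score, support), copied from the report above.
report = {
    "Child 0-12":       (0.9763, 0.9758, 0.9761, 2193),
    "Teenager 13-20":   (0.9158, 0.8437, 0.8783, 1779),
    "Adult 21-44":      (0.9593, 0.9779, 0.9685, 9999),
    "Middle Age 45-64": (0.9458, 0.9450, 0.9454, 3785),
    "Aged 65+":         (0.9769, 0.9381, 0.9571, 1260),
}

# Total number of evaluated samples.
total_support = sum(row[3] for row in report.values())


def macro_avg(col):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[col] for row in report.values()) / len(report)


def weighted_avg(col):
    """Mean of one metric column, weighted by each class's support."""
    return sum(row[col] * row[3] for row in report.values()) / total_support


# Columns 0, 1, 2 are precision, recall, and f1-score.
macro = [macro_avg(c) for c in range(3)]
weighted = [weighted_avg(c) for c in range(3)]

print("support:     ", total_support)
print("macro avg:   ", [round(v, 4) for v in macro])
print("weighted avg:", [round(v, 4) for v in weighted])
```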

![download](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/IgY-jH6zM7uE9_7SAKkcA.png)

---

The model categorizes images into five age ranges:

* **Class 0:** "Child 0-12"
* **Class 1:** "Teenager 13-20"
* **Class 2:** "Adult 21-44"
* **Class 3:** "Middle Age 45-64"
* **Class 4:** "Aged 65+"
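
When checking predictions against data that records a numeric age, the five ranges above can be written as a small lookup. This is a minimal sketch: `ID2LABEL`, `UPPER_BOUNDS`, and `age_to_class` are hypothetical names introduced here for illustration and are not part of the released model or its training code. The sketch also assumes ages are given in whole years.

```python
# Class index -> label, as listed above.
ID2LABEL = {
    0: "Child 0-12",
    1: "Teenager 13-20",
    2: "Adult 21-44",
    3: "Middle Age 45-64",
    4: "Aged 65+",
}

# Inclusive upper bound of each range, except the open-ended last one.
UPPER_BOUNDS = [12, 20, 44, 64]


def age_to_class(age: int) -> int:
    """Hypothetical helper: map an age in whole years to a class index."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for class_id, upper in enumerate(UPPER_BOUNDS):
        if age <= upper:
            return class_id
    return len(UPPER_BOUNDS)  # 65 and above falls into the last class


print(ID2LABEL[age_to_class(30)])
```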

---

# **Run with Transformers**

```python
!pip install -q transformers torch pillow gradio
```

```python
import gradio as gr
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Model name from Hugging Face Hub
model_name = "prithivMLmods/MetaCLIP-2-Age-Range-Estimator"

# Load processor and model
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# Define labels
LABELS = {
    0: "Child (0–12)",
    1: "Teenager (13–20)",
    2: "Adult (21–44)",
    3: "Middle Age (45–64)",
    4: "Aged (65+)"
}

def age_classification(image):
    """Predict the age group of a person from an image."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    predictions = {LABELS[i]: round(probs[i], 3) for i in range(len(probs))}
    return predictions

# Build Gradio interface
iface = gr.Interface(
    fn=age_classification,
    inputs=gr.Image(type="numpy", label="Upload Image"),
    outputs=gr.Label(label="Predicted Age Group Probabilities"),
    title="MetaCLIP-2 Age Range Estimator",
    description="Upload a face image to estimate the person's age group using MetaCLIP-2."
)

# Launch app
if __name__ == "__main__":
    iface.launch()
```
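
The post-processing step inside `age_classification` (a softmax over the five logits, followed by a label-to-probability dictionary) can be illustrated without loading the model. The following is a minimal sketch in plain Python: the logits are made-up values chosen only for illustration, and `logits_to_predictions` is a hypothetical helper, not part of the model or of Transformers.

```python
import math

# Same label mapping as in the Gradio example.
LABELS = {
    0: "Child (0–12)",
    1: "Teenager (13–20)",
    2: "Adult (21–44)",
    3: "Middle Age (45–64)",
    4: "Aged (65+)",
}


def logits_to_predictions(logits):
    """Apply softmax to raw scores and round each probability to 3 decimals."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return {LABELS[i]: round(p, 3) for i, p in enumerate(probs)}


# Made-up logits for illustration; real values come from `model(**inputs).logits`.
example_logits = [-1.2, 0.3, 4.1, 1.0, -2.5]

predictions = logits_to_predictions(example_logits)
top_label = max(predictions, key=predictions.get)
print(top_label, predictions[top_label])
```

`gr.Label` accepts a label-to-confidence dictionary of this shape directly, which is why the Gradio function returns the dictionary rather than a single class name.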
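
# **Sample Inference:**

![Screenshot 2025-11-13 at 17-41-58 MetaCLIP-2 Age Range Estimator](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/exdLWepMw5bFMVMU5mUKJ.png)
![Screenshot 2025-11-13 at 17-42-48 MetaCLIP-2 Age Range Estimator](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/AljCc2sY3uOHVgTvV52Vx.png)
![Screenshot 2025-11-13 at 17-45-44 MetaCLIP-2 Age Range Estimator](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/jAoVjODBmwfZ-W-_tLpS_.png)
![Screenshot 2025-11-13 at 17-46-49 MetaCLIP-2 Age Range Estimator](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/rSpFkLNtVjS5Wug2fhfZt.png)
![Screenshot 2025-11-13 at 17-47-38 MetaCLIP-2 Age Range Estimator](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/FxnBuTWVWvXnp-EpEZpYn.png)
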
# **Intended Use:**

The **MetaCLIP-2-Age-Range-Estimator** model is designed to classify images into five age categories.
Potential use cases include:

* **Demographic Analysis:** Supporting research and business insights into age distribution.
* **Health and Fitness Applications:** Assisting in age-based health recommendations.
* **Security and Access Control:** Enabling age verification systems.
* **Retail and Marketing:** Enhancing personalization and customer profiling.
* **Forensics and Surveillance:** Supporting age estimation in investigative and security contexts.