AquaTwin · STGCN v2

Digital twin for monthly per-capita water consumption across the 53 neighborhoods (barrios) of Alicante (Spain).
Developed by Equipo AGUARDIENTE for the AMAEM Hackathon.

Model description

AquaTwin uses a Spatio-Temporal Graph Convolutional Network (STGCN) to forecast monthly per-capita water consumption (consumo_m3_per_capita) at neighborhood level.
The spatial graph encodes geographic adjacency between barrios (53 nodes, 126 edges) derived from the official Alicante GeoJSON. Spatial dependencies are captured via Chebyshev graph convolutions (K=3), and temporal dependencies via multi-head self-attention (2 heads) over a 12-month sliding window.
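
For reference, a Chebyshev graph convolution is commonly written as follows. This is the standard textbook formulation, not a transcription of the AquaTwin code. Here A is the 53×53 adjacency matrix, D its degree matrix, X the node-feature matrix and Θ_k the learnable weights. Whether K=3 means three terms (orders 0 to 2, as assumed below) or a polynomial of order 3 depends on the implementation.

```latex
\begin{align}
L &= I - D^{-1/2} A D^{-1/2}, &
\tilde{L} &= \frac{2}{\lambda_{\max}} L - I \\
T_0(\tilde{L}) &= I, &
T_1(\tilde{L}) &= \tilde{L} \\
T_k(\tilde{L}) &= 2\,\tilde{L}\,T_{k-1}(\tilde{L}) - T_{k-2}(\tilde{L}), &
Y &= \sum_{k=0}^{K-1} T_k(\tilde{L})\, X\, \Theta_k
\end{align}
```

With K terms, each output node mixes features from barrios up to K-1 hops away on the adjacency graph.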

Architecture

Hyperparameter	Value
Input window	12 months
Nodes	53 barrios
Node features	16
Hidden size	64
Chebyshev order K	3
Attention heads	2
Dropout	0.30
Node drop (training)	0.05
Epochs trained	563

Input features (16)

Feature	Description
`consumo_m3_per_capita`	Target (lag input)
`consumo_por_contrato`	m³ per active contract
`temp_media_c`	Monthly mean temperature (AEMET)
`temp_max_c`	Monthly max temperature (AEMET)
`precip_mm`	Monthly precipitation (AEMET)
`etp_mm`	Potential evapotranspiration (Thornthwaite method)
`pernoctaciones`	Hotel overnight stays (INE table 2074)
`ipc_idx`	CPI index (INE IPC251852)
`mes_sin` / `mes_cos`	Cyclical month encoding
`n_festivos`	Public holidays in month
`hogueras_flag`	Hogueras de San Juan festival
`semana_santa_flag`	Holy Week flag
`ratio_contratos`	Contract ratio vs. city mean
`pct_dom`	% domestic contracts
`covid_flag`	COVID-19 disruption period
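
The `mes_sin` / `mes_cos` pair can be illustrated with a minimal sketch. The phase convention used here (January at angle 0) is an assumption, as are the function names; the project may use a different offset.

```python
import math


def encode_month(month: int) -> tuple[float, float]:
    """Map a calendar month (1-12) to a (sin, cos) point on the unit circle.

    Assumption: January sits at angle 0; the original pipeline may differ.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)


def circular_distance(month_a: int, month_b: int) -> float:
    """Euclidean distance between two encoded months."""
    sin_a, cos_a = encode_month(month_a)
    sin_b, cos_b = encode_month(month_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# December and January end up as close as January and February,
# which a plain 1..12 integer feature would not capture.
print(circular_distance(12, 1), circular_distance(1, 2))
```

Two columns are needed because a single sine or cosine maps two different months to the same value.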

Training data

Real data: 36 months (AMAEM open data portal, 53 barrios)
Synthetic data: 36 months generated with a Conditional Variational Autoencoder (CVAE v2), achieving seasonal correlation r = 0.921 with real data
Total: 72 months → 42 training sequences (window=12, step=1)

Synthetic augmentation with the CVAE was necessary because the STGCN needs sufficient sequence diversity to generalise; the 21 sequences available from real data alone were insufficient.
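
The windowing (window=12, step=1) can be sketched as below. The function name, the one-step-ahead target and the toy series are assumptions for illustration. Under these assumptions a contiguous series of n months yields n - 12 pairs, so 36 months give at most 24; the counts quoted above are smaller, which would be consistent with some months being reserved for evaluation, though the exact split is not stated here.

```python
def make_windows(series, window=12, horizon=1):
    """Split a chronologically ordered list of monthly snapshots into
    (input_window, target) pairs, sliding one month at a time.

    Assumption: the target is the snapshot `horizon` months after the
    end of the input window (one-step-ahead by default).
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        x = series[start:start + window]
        y = series[start + window + horizon - 1]
        pairs.append((x, y))
    return pairs


# Toy series: month indices stand in for (53, 16) feature snapshots.
months = list(range(36))
pairs = make_windows(months)
print(len(pairs))  # 24
```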

Evaluation results

Evaluated on a held-out test set of real data:

Metric	Value
R²	0.953
MAE	0.902 m³/person
RMSE	2.212 m³/person
MAPE	9.0 %
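
For readers who want to recompute these metrics, a minimal sketch using the usual definitions follows. It assumes the values are pooled over all barrio-months in the test set; whether the reported figures were pooled or averaged per barrio is not stated here, and the function name is illustrative.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², MAE, RMSE and MAPE (in %) for two equal-length sequences.

    MAPE is undefined when a true value is zero; this sketch assumes
    strictly positive consumption values.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "rmse": rmse, "mape": mape}


# Toy values in m³/person, not project data.
print(regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0]))
```

RMSE weights large errors more heavily than MAE, so a gap between the two (as in the table above) usually points to a few large misses rather than uniformly spread error.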

Honest limitations

Globally, the model outperforms a lag-12 naive baseline (39.7% RMSE reduction), but it underperforms that baseline in approximately 29 of the 53 barrios, those with stable, low-variance consumption. For those barrios, the simple baseline is more accurate. This trade-off is documented explicitly in the project notebook.
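
The comparison can be sketched as follows, assuming the lag-12 baseline simply repeats the value observed 12 months earlier and that "RMSE reduction" means 1 - RMSE_model / RMSE_naive. Both are assumptions, and the function names and toy numbers are illustrative only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error of two equal-length sequences."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def lag12_forecast(history, n_ahead):
    """Seasonal naive forecast: each month repeats the value from 12 months before."""
    if len(history) < 12:
        raise ValueError("need at least 12 months of history")
    extended = list(history)
    for _ in range(n_ahead):
        extended.append(extended[-12])
    return extended[len(history):]


def rmse_reduction_pct(y_true, model_pred, naive_pred):
    """Positive: the model beats the baseline. Negative: the baseline wins."""
    return 100.0 * (1.0 - rmse(y_true, model_pred) / rmse(y_true, naive_pred))


# Toy single-barrio example (not project data).
y_true = [10.0, 10.0]
print(rmse_reduction_pct(y_true, [11.0, 9.0], [12.0, 8.0]))  # 50.0
```

Running the same function per barrio instead of on pooled values is what exposes the barrios where the sign flips.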

Repository files

File	Description
`stgcn_v2_weights.pt`	PyTorch model weights (state dict)
`stgcn_v2_config.json`	Full architecture config & evaluation metrics
`stgcn_v2_target_mean.npy`	Per-node target mean (for denormalization)
`stgcn_v2_target_std.npy`	Per-node target std (for denormalization)
`adjacency_matrix.csv`	53×53 geographic adjacency matrix
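
Before feeding the adjacency matrix to a model, a quick sanity check can help. The sketch below assumes the CSV is a plain numeric matrix without a header row or index column, which is not confirmed here, and that the 126 edges are counted as undirected pairs. The helper names are hypothetical.

```python
import csv


def load_matrix(path):
    """Read a header-less numeric CSV into a list of float rows (assumed layout)."""
    with open(path, newline="") as f:
        return [[float(v) for v in row] for row in csv.reader(f) if row]


def summarize_adjacency(matrix):
    """Report node count, undirected edge count and symmetry of a square matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("adjacency matrix must be square")
    upper = [(i, j) for i in range(n) for j in range(i + 1, n)]
    symmetric = all(matrix[i][j] == matrix[j][i] for i, j in upper)
    n_edges = sum(1 for i, j in upper if matrix[i][j] != 0)
    return {"n_nodes": n, "n_edges": n_edges, "symmetric": symmetric}


# Toy 3-node path graph; for the published file one would expect
# n_nodes == 53 and, if edges are undirected, n_edges == 126.
toy = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(summarize_adjacency(toy))
```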

Usage

import json

import numpy as np
import torch

# Load config
with open("stgcn_v2_config.json") as f:
    cfg = json.load(f)

# Load normalisation
target_mean = np.load("stgcn_v2_target_mean.npy")  # shape (53,)
target_std  = np.load("stgcn_v2_target_std.npy")   # shape (53,)

# Load model (define STGCNv2 class matching config first)
model = STGCNv2(
    n_nodes=cfg["n_nodes"],
    n_features=cfg["n_features"],
    hidden=cfg["hidden"],
    cheb_k=cfg["cheb_k"],
    n_heads=cfg["n_heads"],
    dropout=cfg["dropout"],
)
model.load_state_dict(torch.load("stgcn_v2_weights.pt", map_location="cpu"))
model.eval()

# Input:  x of shape (batch, window=12, n_nodes=53, n_features=16)
# Output: normalised predictions of shape (batch, n_nodes=53)
# Denormalise: pred_real = pred_norm * target_std + target_mean
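
The denormalisation step relies on NumPy broadcasting a `(batch, 53)` array against the two `(53,)` vectors. A minimal sketch with stand-in values follows; the random mean and std arrays replace the real `.npy` files, and `denormalise` is a hypothetical helper name.

```python
import numpy as np


def denormalise(pred_norm, target_mean, target_std):
    """Undo per-node z-scoring: (batch, n_nodes) * (n_nodes,) + (n_nodes,)."""
    pred_norm = np.asarray(pred_norm, dtype=float)
    if pred_norm.shape[-1] != target_mean.shape[0]:
        raise ValueError("last axis of predictions must match the number of nodes")
    return pred_norm * target_std + target_mean


# Stand-in statistics; in practice these come from the two .npy files.
rng = np.random.default_rng(0)
target_mean = rng.uniform(5.0, 15.0, size=53)
target_std = rng.uniform(0.5, 2.0, size=53)

# A normalised prediction of zero maps back to each barrio's mean.
pred_norm = np.zeros((4, 53))
pred_real = denormalise(pred_norm, target_mean, target_std)
print(pred_real.shape)  # (4, 53)
```

If `pred` is the tensor returned by the model, `pred.detach().cpu().numpy()` converts it to the array expected here.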

Citation

Equipo AGUARDIENTE (2025). AquaTwin: Digital Twin for Urban Water Consumption
using Spatio-Temporal Graph Convolutional Networks.
AMAEM Hackathon, Alicante, Spain.

Data sources

AMAEM – Aguas Municipalizadas de Alicante, Empresa Mixta (open data)
AEMET – Agencia Estatal de Meteorología
INE – Instituto Nacional de Estadística (hotel stays, CPI, population)
Ayuntamiento de Alicante – Official GeoJSON barrio boundaries
