AquaTwin · STGCN v2

Digital twin for monthly per-capita water consumption across the 53 neighborhoods (barrios) of Alicante (Spain).
Developed by Equipo AGUARDIENTE for the AMAEM Hackathon.


Model description

AquaTwin uses a Spatio-Temporal Graph Convolutional Network (STGCN) to forecast monthly per-capita water consumption (consumo_m3_per_capita) at neighborhood level.
The spatial graph encodes geographic adjacency between barrios (53 nodes, 126 edges) derived from the official Alicante GeoJSON. Spatial dependencies are captured via Chebyshev graph convolutions (K=3), and temporal dependencies via multi-head self-attention (2 heads) over a 12-month sliding window.
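The Chebyshev graph convolution named above can be illustrated with a minimal NumPy sketch. It is not the project's layer: it assumes the symmetric normalised Laplacian, reads K=3 as three polynomial terms (T_0, T_1, T_2), and uses hypothetical names (`scaled_laplacian`, `cheb_conv`) and a toy 3-node path graph instead of the 53-barrio graph.

```python
import numpy as np

def scaled_laplacian(adj: np.ndarray) -> np.ndarray:
    """Return L~ = 2 L / lambda_max - I, with L the symmetric normalised Laplacian."""
    deg = adj.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor instead of a division by zero.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    eye = np.eye(len(adj))
    lap = eye - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - eye

def cheb_conv(x: np.ndarray, adj: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Chebyshev graph convolution.

    x: (n_nodes, f_in), weights: (K, f_in, f_out), returns (n_nodes, f_out).
    """
    l_tilde = scaled_laplacian(adj)
    t_prev, t_curr = x, l_tilde @ x           # T_0 x = x, T_1 x = L~ x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for k in range(2, len(weights)):
        # Chebyshev recurrence: T_k = 2 L~ T_{k-1} - T_{k-2}
        t_prev, t_curr = t_curr, 2.0 * l_tilde @ t_curr - t_prev
        out = out + t_curr @ weights[k]
    return out

# Toy example: 3 nodes in a path, 2 input features, 4 output features.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x = np.arange(6, dtype=float).reshape(3, 2)
w = np.random.default_rng(0).normal(size=(3, 2, 4))
y = cheb_conv(x, adj, w)
print(y.shape)  # (3, 4)
```

With three terms, each output node mixes information from neighbours up to two hops away, which is why a small K already spreads signal between adjacent barrios.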

Architecture

| Hyperparameter | Value |
|---|---|
| Input window | 12 months |
| Nodes | 53 barrios |
| Node features | 16 |
| Hidden size | 64 |
| Chebyshev order K | 3 |
| Attention heads | 2 |
| Dropout | 0.30 |
| Node drop (training) | 0.05 |
| Epochs trained | 563 |

Input features (16)

| Feature | Description |
|---|---|
| consumo_m3_per_capita | Target (lag input) |
| consumo_por_contrato | m³ per active contract |
| temp_media_c | Monthly mean temperature (AEMET) |
| temp_max_c | Monthly max temperature (AEMET) |
| precip_mm | Monthly precipitation (AEMET) |
| etp_mm | Potential evapotranspiration (Thornthwaite method) |
| pernoctaciones | Hotel overnight stays (INE table 2074) |
| ipc_idx | CPI index (INE IPC251852) |
| mes_sin / mes_cos | Cyclical month encoding |
| n_festivos | Public holidays in month |
| hogueras_flag | Hogueras de San Juan festival |
| semana_santa_flag | Holy Week flag |
| ratio_contratos | Contract ratio vs. city mean |
| pct_dom | % domestic contracts |
| covid_flag | COVID-19 disruption period |
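The cyclical month encoding behind `mes_sin` / `mes_cos` can be sketched as follows. The card does not state the exact formula, so the phase (January at angle 0) and the helper names `encode_month` and `circular_distance` are assumptions made for illustration.

```python
import math

def encode_month(month: int) -> tuple[float, float]:
    """Map a calendar month (1-12) to (sin, cos) coordinates on the unit circle."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    angle = 2.0 * math.pi * (month - 1) / 12.0  # assumed phase: January -> 0
    return math.sin(angle), math.cos(angle)

def circular_distance(a: int, b: int) -> float:
    """Euclidean distance between two months in the encoded space."""
    sin_a, cos_a = encode_month(a)
    sin_b, cos_b = encode_month(b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)

# December and January end up as close as January and February,
# which a raw 1..12 integer feature would not capture.
dec_jan = circular_distance(12, 1)
jan_feb = circular_distance(1, 2)
print(abs(dec_jan - jan_feb) < 1e-12)  # True
```

Using both sine and cosine is what makes the encoding unambiguous: either one alone maps two different months to the same value.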

Training data

  • Real data: 36 months (AMAEM open data portal, 53 barrios)
  • Synthetic data: 36 months generated with a Conditional Variational Autoencoder (CVAE v2), whose seasonal profile correlates with the real data at r = 0.921
  • Total: 72 months → 42 training sequences (window=12, step=1), 21 from real and 21 from synthetic data

The CVAE was necessary because STGCN requires sufficient sequence diversity to generalise; 21 sequences from real data alone were insufficient.
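How a monthly series turns into training sequences can be sketched with a generic sliding-window helper. This is an illustration only: `make_windows` is a hypothetical name, a one-month-ahead target (`horizon=1`) is assumed, and the sequence counts quoted above also depend on how validation and test months are held out, which this sketch does not model.

```python
def make_windows(series, window=12, horizon=1, step=1):
    """Split a chronologically ordered sequence into (input_window, target) pairs."""
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(0, last_start + 1, step):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs

# Toy series of 15 monthly values: 15 - 12 - 1 + 1 = 3 training pairs.
months = list(range(1, 16))
pairs = make_windows(months)
print(len(pairs))    # 3
print(pairs[0][1])   # 13 (the month following the first 12-month window)
```

Because the window is long relative to a 36-month history, only a few dozen distinct sequences are available, which is the scarcity the synthetic months are meant to ease.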


Evaluation results

Evaluated on a held-out test set of real data:

| Metric | Value |
|---|---|
| R² | 0.953 |
| MAE | 0.902 m³/person |
| RMSE | 2.212 m³/person |
| MAPE | 9.0% |
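For reference, the standard definitions of MAE, RMSE, MAPE and the coefficient of determination R² can be written in a few lines of plain Python. The function name `regression_metrics` and the toy values are illustrative; the project's own evaluation code may aggregate over barrios and months differently.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in %) and R² for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # MAPE is undefined when a true value is zero.
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }

# Toy values in m³/person.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0])
print(metrics["mae"])  # 2.5
```

RMSE penalises large errors more heavily than MAE, so an RMSE well above the MAE is consistent with a small number of large misses rather than uniformly moderate errors.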

Honest limitations

Globally, the model outperforms a lag-12 naive baseline (39.7% lower RMSE). Per barrio, however, it underperforms that baseline in approximately 29 of the 53 barrios, namely those with stable, low-variance consumption, where repeating the value from 12 months earlier is more accurate. This trade-off is documented explicitly in the project notebook.
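The comparison against the lag-12 baseline can be sketched as below. All names (`lag12_forecast`, `rmse_reduction_pct`, `barrios_where_baseline_wins`) and all numbers are hypothetical; the sketch only shows how a global RMSE reduction and a per-barrio win count would be computed, not the project's actual evaluation.

```python
def lag12_forecast(history):
    """Seasonal-naive forecast: repeat the value observed 12 months earlier."""
    if len(history) < 12:
        raise ValueError("need at least 12 months of history")
    return history[-12]

def rmse_reduction_pct(rmse_model, rmse_baseline):
    """Relative RMSE improvement of the model over the baseline, in percent."""
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline

def barrios_where_baseline_wins(model_rmse, baseline_rmse):
    """Return the barrios whose baseline RMSE is lower than the model's."""
    return [name for name in model_rmse if baseline_rmse[name] < model_rmse[name]]

# Toy history: the forecast for the next month is the value from 12 months ago.
history = [float(m) for m in range(1, 13)]
print(lag12_forecast(history))         # 1.0
print(rmse_reduction_pct(3.0, 5.0))    # 40.0

# Hypothetical per-barrio RMSE values (m³/person).
model_rmse = {"barrio_a": 0.8, "barrio_b": 1.5, "barrio_c": 0.4}
baseline_rmse = {"barrio_a": 0.5, "barrio_b": 4.0, "barrio_c": 0.3}
print(barrios_where_baseline_wins(model_rmse, baseline_rmse))  # ['barrio_a', 'barrio_c']
```

The toy numbers show how both statements in the paragraph can hold at once: one high-error barrio can dominate the global RMSE while the baseline still wins in most individual barrios.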


Repository files

| File | Description |
|---|---|
| stgcn_v2_weights.pt | PyTorch model weights (state dict) |
| stgcn_v2_config.json | Full architecture config & evaluation metrics |
| stgcn_v2_target_mean.npy | Per-node target mean (for denormalization) |
| stgcn_v2_target_std.npy | Per-node target std (for denormalization) |
| adjacency_matrix.csv | 53×53 geographic adjacency matrix |
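Before using the adjacency file, a quick sanity check can confirm that it is square, symmetric and has the expected number of edges. The sketch below assumes `adjacency_matrix.csv` is a plain numeric CSV without a header row or index column, which the card does not confirm; `load_adjacency` and `count_undirected_edges` are hypothetical helper names.

```python
import csv
import io

def load_adjacency(text_stream):
    """Parse a headerless numeric CSV into a square list-of-lists matrix."""
    rows = [[float(v) for v in row] for row in csv.reader(text_stream) if row]
    if any(len(row) != len(rows) for row in rows):
        raise ValueError("adjacency matrix is not square")
    return rows

def count_undirected_edges(adj):
    """Count non-zero entries above the diagonal, checking symmetry on the way."""
    n = len(adj)
    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] != adj[j][i]:
                raise ValueError(f"matrix is not symmetric at ({i}, {j})")
            if adj[i][j] != 0:
                edges += 1
    return edges

# Toy 3-node path graph standing in for the real file.
toy_csv = io.StringIO("0,1,0\n1,0,1\n0,1,0\n")
toy_adj = load_adjacency(toy_csv)
print(count_undirected_edges(toy_adj))  # 2
```

For the real file, the stream would come from `open("adjacency_matrix.csv", newline="")`, and the check should report 53 rows and 126 edges if the layout assumption holds.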

Usage

import json

import numpy as np
import torch

# Load config
with open("stgcn_v2_config.json") as f:
    cfg = json.load(f)

# Load normalisation
target_mean = np.load("stgcn_v2_target_mean.npy")  # shape (53,)
target_std  = np.load("stgcn_v2_target_std.npy")   # shape (53,)

# Load model (define STGCNv2 class matching config first)
model = STGCNv2(
    n_nodes=cfg["n_nodes"],
    n_features=cfg["n_features"],
    hidden=cfg["hidden"],
    cheb_k=cfg["cheb_k"],
    n_heads=cfg["n_heads"],
    dropout=cfg["dropout"],
)
model.load_state_dict(torch.load("stgcn_v2_weights.pt", map_location="cpu"))
model.eval()

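# Input: x of shape (batch, window=12, n_nodes=53, n_features=16)
# Placeholder tensor; replace with real feature windows.
x = torch.zeros(1, 12, cfg["n_nodes"], cfg["n_features"])

# Output: shape (batch, n_nodes=53), normalised predictions
# (assumes your STGCNv2.forward takes only the feature tensor)
with torch.no_grad():
    pred_norm = model(x).numpy()

# Denormalise: the per-node statistics broadcast over the batch axis
pred_real = pred_norm * target_std + target_mean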
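The denormalisation step relies on per-node statistics broadcasting across the batch axis. The round trip can be checked on a toy array as follows; the card only gives the denormalisation formula, so computing the statistics as a per-node z-score over training months is an assumption, and the helper names are hypothetical.

```python
import numpy as np

def fit_target_stats(y_train: np.ndarray):
    """Per-node mean and std over the time axis; y_train has shape (n_months, n_nodes)."""
    mean = y_train.mean(axis=0)
    std = y_train.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # avoid dividing by zero for constant nodes
    return mean, std

def normalise(y, mean, std):
    return (y - mean) / std

def denormalise(y_norm, mean, std):
    # (batch, n_nodes) * (n_nodes,) broadcasts over the batch axis.
    return y_norm * std + mean

# Toy targets: 3 months, 2 nodes (m³/person).
y_train = np.array([[4.0, 10.0], [6.0, 14.0], [8.0, 18.0]])
mean, std = fit_target_stats(y_train)
restored = denormalise(normalise(y_train, mean, std), mean, std)
print(np.allclose(restored, y_train))  # True
```

Because each node has its own mean and std, a normalised prediction of 0 maps back to a different consumption level in every barrio.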

Citation

Equipo AGUARDIENTE (2025). AquaTwin: Digital Twin for Urban Water Consumption
using Spatio-Temporal Graph Convolutional Networks.
AMAEM Hackathon, Alicante, Spain.

Data sources

  • AMAEM – Aguas Municipalizadas de Alicante, Empresa Mixta (open data)
  • AEMET – Agencia Estatal de Meteorología
  • INE – Instituto Nacional de Estadística (hotel stays, CPI, population)
  • Ayuntamiento de Alicante – Official GeoJSON barrio boundaries