Weather Forecasting Models — Tufts CS137
Deep learning models for 24-hour weather prediction at Tufts University (Jumbo Statue, Medford, MA), trained on NOAA HRRR 3 km analysis data.
Six architectures are trained and compared: CNN Baseline, ResNet-18, ConvNeXt-Tiny, Multi-frame CNN, 3D CNN, and Vision Transformer (ViT).
Models
| Model | File | Params | Architecture | TMP RMSE (K) | Rain AUC |
|---|---|---|---|---|---|
| WeatherViT | vit/best.pt | 7.4M | 6-layer Transformer, 15×15 patches, 900 tokens | 4.06 | 0.776 |
| ResNet-18 | checkpoints/resnet18.pt | 11.2M | Modified torchvision ResNet-18 | 3.54 | 0.768 |
| CNN Baseline | checkpoints/cnn_baseline.pt | 11.3M | 6 ResBlocks, progressive downsample | 4.00 | 0.738 |
Full Test Results (2021)
| Model | TMP (K) | RH (%) | UGRD (m/s) | VGRD (m/s) | GUST (m/s) | APCP>2mm (mm) | AUC |
|---|---|---|---|---|---|---|---|
| ViT | 4.06 | 16.45 | 2.59 | 2.21 | 3.57 | 4.50 | 0.776 |
| ResNet-18 | 3.54 | 15.68 | 2.70 | 2.34 | 3.60 | 4.53 | 0.768 |
| CNN Baseline | 4.00 | 15.89 | 2.56 | 2.23 | 3.58 | 4.56 | 0.738 |
| ConvNeXt-Tiny | 3.66 | 15.85 | 2.54 | 2.17 | 3.65 | 4.55 | 0.692 |
| CNN 3D | 4.76 | 17.44 | 2.61 | 2.32 | 3.58 | 4.75 | 0.668 |
| Multi-frame CNN | 4.55 | 18.41 | 2.62 | 2.45 | 3.62 | 4.76 | 0.652 |
| Persistence | 4.86 | 23.01 | 3.73 | 2.89 | 4.87 | 4.62 | 0.506 |
Key findings:
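- ViT achieves the best rain detection AUC (0.776), precipitation RMSE (4.50 mm), and wind gust RMSE (3.57 m/s)
- ResNet-18 leads in temperature (3.54 K) and humidity (15.68%) accuracy
- ConvNeXt-Tiny has the lowest U-wind (2.54 m/s) and V-wind (2.17 m/s) error
- Every model beats the persistence baseline on all metrics except precipitation RMSE, where CNN 3D (4.75) and Multi-frame CNN (4.76) trail persistence (4.62)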
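The per-metric leaders and the comparison against persistence can be re-derived from the test table with a few lines of Python. This is a minimal sketch with the table values hardcoded; it assumes every column except AUC is an error metric where lower is better, and the function names are illustrative rather than part of the repository.

```python
# Values copied from the 2021 test table. Assumption: all columns except
# AUC are errors (lower is better); AUC is higher-is-better.
METRICS = ["TMP", "RH", "UGRD", "VGRD", "GUST", "APCP", "AUC"]
RESULTS = {
    "ViT":             [4.06, 16.45, 2.59, 2.21, 3.57, 4.50, 0.776],
    "ResNet-18":       [3.54, 15.68, 2.70, 2.34, 3.60, 4.53, 0.768],
    "CNN Baseline":    [4.00, 15.89, 2.56, 2.23, 3.58, 4.56, 0.738],
    "ConvNeXt-Tiny":   [3.66, 15.85, 2.54, 2.17, 3.65, 4.55, 0.692],
    "CNN 3D":          [4.76, 17.44, 2.61, 2.32, 3.58, 4.75, 0.668],
    "Multi-frame CNN": [4.55, 18.41, 2.62, 2.45, 3.62, 4.76, 0.652],
}
PERSISTENCE = [4.86, 23.01, 3.73, 2.89, 4.87, 4.62, 0.506]
HIGHER_IS_BETTER = {"AUC"}


def best_per_metric(results=RESULTS):
    """Map each metric to the name of the model that scores best on it."""
    best = {}
    for i, metric in enumerate(METRICS):
        pick = max if metric in HIGHER_IS_BETTER else min
        best[metric] = pick(results, key=lambda name: results[name][i])
    return best


def worse_than_persistence(results=RESULTS, baseline=PERSISTENCE):
    """List (model, metric) pairs where the model does not beat persistence."""
    losers = []
    for name, values in results.items():
        for i, metric in enumerate(METRICS):
            if metric in HIGHER_IS_BETTER:
                worse = values[i] < baseline[i]
            else:
                worse = values[i] > baseline[i]
            if worse:
                losers.append((name, metric))
    return losers
```

`best_per_metric()` returns a dict keyed by column name, and `worse_than_persistence()` returns an empty list only if every model beats the baseline on every column.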
Input
- Format: 42-channel spatial grid (450 × 449 pixels)
- Resolution: 3 km (HRRR Lambert Conformal projection)
- Region: US Northeast / New England (~1350 km × 1350 km)
- Channels: Surface variables (temperature, humidity, wind, precipitation, radiation) + atmospheric variables at multiple pressure levels (CAPE, dew point, geopotential height, temperature, U/V wind, cloud cover, moisture)
Output
6 continuous values predicted 24 hours ahead at a single target point:
| Variable | Unit |
|---|---|
| 2m Temperature | K |
| 2m Relative Humidity | % |
| 10m U-Wind | m/s |
| 10m V-Wind | m/s |
| Surface Gust | m/s |
| 1hr Precipitation | mm |
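For display, the six raw outputs are often easier to read once labeled and converted to derived quantities. The sketch below is a hypothetical post-processing helper, not code from the repository: it assumes the output vector follows the row order of the table above, and the key names are made up for illustration. HRRR wind components may be relative to the model grid rather than true north, so the compass direction here is only approximate unless a grid rotation is applied first.

```python
import math

# Assumption: outputs arrive in the same order as the table rows above.
# These key names are illustrative, not taken from the repository.
OUTPUT_NAMES = ["tmp_k", "rh_pct", "ugrd_ms", "vgrd_ms", "gust_ms", "apcp_mm"]


def describe_forecast(values):
    """Label a 6-element forecast and add Celsius, wind speed, and direction."""
    if len(values) != len(OUTPUT_NAMES):
        raise ValueError(f"expected {len(OUTPUT_NAMES)} values, got {len(values)}")
    out = dict(zip(OUTPUT_NAMES, (float(v) for v in values)))
    u, v = out["ugrd_ms"], out["vgrd_ms"]
    out["tmp_c"] = out["tmp_k"] - 273.15
    out["wind_speed_ms"] = math.hypot(u, v)
    # Meteorological convention: the direction the wind blows FROM,
    # in degrees clockwise from north.
    out["wind_from_deg"] = math.degrees(math.atan2(-u, -v)) % 360.0
    return out
```

With a tensor prediction of shape `(1, 6)`, something like `describe_forecast(pred[0].tolist())` would feed this helper.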
Architecture Highlights
WeatherViT (new)
Input (B,42,450,449) → pad→450×450 → PatchEmbed(15×15, 900 patches)
→ [CLS]+PosEmbed → 6×TransformerBlock(8 heads, dim=256) → CLS → FC → (B,6)
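The patch and token counts in the diagram follow from simple arithmetic. As a worked example (a sketch only; `patch_grid` is a hypothetical helper, and which edge receives the padding is not stated in the source), padding each spatial dimension up to the next multiple of the patch size gives:

```python
def patch_grid(height, width, patch):
    """Return (padded_h, padded_w, n_patches) for non-overlapping square patches."""
    pad_h = (-height) % patch  # rows needed to reach a multiple of `patch`
    pad_w = (-width) % patch   # columns needed to reach a multiple of `patch`
    padded_h, padded_w = height + pad_h, width + pad_w
    n_patches = (padded_h // patch) * (padded_w // patch)
    return padded_h, padded_w, n_patches


padded_h, padded_w, n_patches = patch_grid(450, 449, 15)
seq_len = n_patches + 1          # patches plus one [CLS] token
values_per_patch = 42 * 15 * 15  # channels x patch area, before projection to dim=256
```

The 449-pixel width needs one extra column to reach 450, which yields a 30 × 30 grid of patches; adding the [CLS] token makes the Transformer sequence one longer than the patch count.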
CNN Baseline
Input (B,42,450,449) → Stem(42→64, 7×7, s=2) → 6×ResBlock → GAP → FC → (B,6)
ResNet-18
Input (B,42,450,449) → Modified torchvision ResNet-18 (42-ch input) → FC → (B,6)
Checkpoint Format
{
    "model": state_dict,              # Model weights
    "norm_stats": {                   # Z-score normalization statistics (values shown are shapes)
        "input_mean": (42, 1, 1),
        "input_std": (42, 1, 1),
        "target_mean": (6,),
        "target_std": (6,),
    },
    "args": {...},                    # Training hyperparameters
}
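Before running inference it can help to confirm that a loaded checkpoint matches this layout. The following is a minimal sketch of such a check; `check_checkpoint` is a hypothetical helper, not part of the repository, and it only relies on each statistic exposing a `.shape` attribute, as PyTorch tensors and NumPy arrays both do.

```python
# Shapes taken from the checkpoint layout described above.
EXPECTED_SHAPES = {
    "input_mean": (42, 1, 1),
    "input_std": (42, 1, 1),
    "target_mean": (6,),
    "target_std": (6,),
}


def check_checkpoint(ckpt):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    for key in ("model", "norm_stats", "args"):
        if key not in ckpt:
            problems.append(f"missing top-level key: {key}")
    stats = ckpt.get("norm_stats", {})
    for name, expected in EXPECTED_SHAPES.items():
        if name not in stats:
            problems.append(f"missing norm_stats entry: {name}")
            continue
        # Objects without a .shape attribute are reported as shape ().
        shape = tuple(getattr(stats[name], "shape", ()))
        if shape != expected:
            problems.append(f"{name}: shape {shape}, expected {expected}")
    return problems
```

A call such as `check_checkpoint(ckpt)` right after loading would surface a mismatched or missing statistic before it turns into a broadcasting error during normalization.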
Usage
import torch
from models import create_model

# Load any model (cnn_baseline, resnet18, vit, convnext_tiny, cnn_3d, cnn_multi_frame).
# weights_only=False unpickles arbitrary objects, so only load checkpoints you trust.
ckpt = torch.load("vit/best.pt", map_location="cpu", weights_only=False)
model = create_model(ckpt["args"]["model"], n_input_channels=42, n_targets=6)
model.load_state_dict(ckpt["model"])
model.eval()

# Inference
x = torch.randn(1, 42, 450, 449)  # (batch, channels, height, width); random placeholder input
norm = ckpt["norm_stats"]
x = (x - norm["input_mean"]) / (norm["input_std"] + 1e-7)  # z-score normalize
with torch.no_grad():
    pred = model(x)  # (1, 6), in normalized units
pred = pred * norm["target_std"] + norm["target_mean"]  # denormalize to physical units
Training Data
HRRR (High-Resolution Rapid Refresh) — NOAA's 3 km hourly weather analysis.
| Split | Period | Samples |
|---|---|---|
| Training | 2018–2019 | ~17,500 |
| Validation | 2020 | ~8,700 |
| Test | 2021 | ~8,700 |
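A 24-hour-ahead sample pairs the analysis at time t with the target observed at t + 24 h, and the splits above are chronological by year. The sketch below shows one way such pairs could be built. It is an assumption-laden illustration, not the repository's data loader: `make_pairs` is hypothetical, and the rule that a pair is kept only when input and target fall in the same split is a choice made here, not something the source states.

```python
from datetime import datetime, timedelta

LEAD = timedelta(hours=24)
# Year-to-split mapping from the table above.
SPLIT_BY_YEAR = {2018: "train", 2019: "train", 2020: "val", 2021: "test"}


def make_pairs(timestamps):
    """Pair each analysis time t with t + 24 h when both are available."""
    available = set(timestamps)
    pairs = {"train": [], "val": [], "test": []}
    for t in sorted(available):
        target = t + LEAD
        split = SPLIT_BY_YEAR.get(t.year)
        if split is None or target not in available:
            continue
        # Assumption: drop pairs that straddle a split boundary.
        if SPLIT_BY_YEAR.get(target.year) != split:
            continue
        pairs[split].append((t, target))
    return pairs


# Usage: four days of hourly timestamps around the 2019/2020 boundary.
start = datetime(2019, 12, 30)
hours = [start + timedelta(hours=h) for h in range(4 * 24)]
example = make_pairs(hours)
```

Splitting by year rather than at random keeps temporally adjacent, highly correlated hours from appearing in both the training and evaluation sets.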
Live Demo
Try the models in real time with live HRRR data: Tufts Weather Forecast Space
The demo fetches real-time HRRR analysis from NOAA, runs inference, and displays:
- Current input field maps (temperature, precipitation, wind, humidity)
- 24-hour forecast at the Jumbo Statue target point
Links
- GitHub Repository
- Live Demo
- Course: Tufts CS 137 — Deep Neural Networks, Spring 2026