DWD Clean URL Architecture & SEO System
This document describes the path-based URL system implemented for the DWD section of Climate Explorer. It serves as a reference template for implementing clean URLs on other sections of the site.
URL Structure
/dwd/{resolution}/{state?}/{station?}/?{view}&{start}&{end}
Path Segments
| Segment | Required | Example | Description |
|---|---|---|---|
| resolution | Yes (defaults to daily) | hourly | Time resolution slug |
| state | No | bayern | German state (Bundesland) slug |
| station | No | muenchen-flughafen | Station name slug |
Query Parameters (UI state only, not indexed)
| Param | Default | Example | Description |
|---|---|---|---|
| view | map | dashboard-plots | Active tab |
| start | Resolution default | 2020-01-01 | Date range start |
| end | Resolution default | 2026-04-26 | Date range end |
URL Examples
# Base landing page (defaults to Daily)
/dwd/
# Resolution pages
/dwd/daily/
/dwd/hourly/
/dwd/10-minutes/
/dwd/monthly/
/dwd/annual/
# State pages
/dwd/daily/bayern/
/dwd/hourly/sachsen/
/dwd/10-minutes/nordrhein-westfalen/
# Station pages
/dwd/daily/bayern/muenchen-flughafen/
/dwd/hourly/sachsen/leipzig-holzhausen/
# With UI state (query params)
/dwd/daily/bayern/muenchen-flughafen/?view=dashboard-plots&start=2020-01-01&end=2026-04-26
Resolution Slugs
| UI Label | URL Slug | Shiny Internal Value |
|---|---|---|
| 10 Minutes | 10-minutes | 10_minutes |
| Hourly | hourly | hourly |
| Daily | daily | daily |
| Monthly | monthly | monthly |
| Annual | annual | annual |
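As an illustration, the table above can be held as a small lookup structure so that all three values for a resolution stay together. This is a minimal sketch, not the project's code: the names RESOLUTIONS, resolutionFromSlug, and slugFromLabel are hypothetical, and falling back to Daily for an unknown value is an assumption based on the /dwd/ default.

```javascript
// Hypothetical lookup table mirroring the Resolution Slugs table above.
const RESOLUTIONS = [
  { label: "10 Minutes", slug: "10-minutes", shiny: "10_minutes" },
  { label: "Hourly", slug: "hourly", shiny: "hourly" },
  { label: "Daily", slug: "daily", shiny: "daily" },
  { label: "Monthly", slug: "monthly", shiny: "monthly" },
  { label: "Annual", slug: "annual", shiny: "annual" },
];

const DEFAULT_RESOLUTION = RESOLUTIONS.find((r) => r.slug === "daily");

// URL slug -> table row. Assumption: unknown or missing slugs fall back to Daily.
function resolutionFromSlug(slug) {
  return RESOLUTIONS.find((r) => r.slug === slug) || DEFAULT_RESOLUTION;
}

// UI label (as broadcast by Shiny) -> URL slug, with the same fallback.
function slugFromLabel(label) {
  const row = RESOLUTIONS.find((r) => r.label === label);
  return (row || DEFAULT_RESOLUTION).slug;
}
```

Keeping one table rather than separate label-to-slug and slug-to-value maps makes it harder for the three columns to drift apart.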
Slugify Algorithm
State and station names are slugified using the same algorithm across all three layers (R, JS, Edge Function):
1. Replace German umlauts: ü→ue, ö→oe, ä→ae, Ü→ue, Ö→oe, Ä→ae, ß→ss
2. Lowercase
3. Strip diacritics (R uses iconv ASCII//TRANSLIT; JS/TS use NFD + regex)
4. Replace non-alphanumeric chars with hyphens
5. Trim leading/trailing hyphens
Examples:
- München-Flughafen → muenchen-flughafen
- Nordrhein-Westfalen → nordrhein-westfalen
- Thüringen → thueringen
- Baden-Württemberg → baden-wuerttemberg
Critical: The slugify function must produce identical output in R (scripts/export_seo_metadata.R), JavaScript (dwd-page.js), and TypeScript (rewrite-meta.ts). Any mismatch causes 404s or broken links. Note that R uses iconv(..., to = "ASCII//TRANSLIT") while JS/TS use NFD normalization and strip combining marks; both produce the same result for German text.
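The five steps can be sketched in JavaScript as follows. This is an illustrative version written from the step list, not a copy of the function in dwd-page.js. One assumption goes beyond the list: runs of consecutive non-alphanumeric characters collapse into a single hyphen.

```javascript
// Sketch of the documented slugify steps (assumption: runs of separators
// collapse into one hyphen; the step list does not say either way).
function slugify(name) {
  return String(name)
    // 1. Replace German umlauts before lowercasing
    .replace(/[\u00fc\u00dc]/g, "ue") // ü, Ü
    .replace(/[\u00f6\u00d6]/g, "oe") // ö, Ö
    .replace(/[\u00e4\u00c4]/g, "ae") // ä, Ä
    .replace(/\u00df/g, "ss")         // ß
    // 2. Lowercase
    .toLowerCase()
    // 3. Strip remaining diacritics: decompose, then drop combining marks
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    // 4. Replace non-alphanumeric characters with hyphens
    .replace(/[^a-z0-9]+/g, "-")
    // 5. Trim leading/trailing hyphens
    .replace(/^-+|-+$/g, "");
}

console.log(slugify("M\u00fcnchen-Flughafen")); // muenchen-flughafen
console.log(slugify("Baden-W\u00fcrttemberg")); // baden-wuerttemberg
```

The umlaut replacement has to run before the diacritic stripping. Otherwise ü would decompose to u plus a combining mark and come out as "u" instead of "ue".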
System Architecture
The URL system spans four layers:
┌─────────────────────────────────────────────────────┐
│ 1. SEO Metadata (Build Time)                        │
│    R script → dwd-seo-metadata.json                 │
│    Generates slug→metadata mappings for all         │
│    stations, states, and resolutions                │
├─────────────────────────────────────────────────────┤
│ 2. Edge Function (Request Time)                     │
│    rewrite-meta.ts                                  │
│    Parses URL → injects HTML body content,          │
│    meta tags, JSON-LD, canonical URL                │
├─────────────────────────────────────────────────────┤
│ 3. Parent Page JS (Client Side)                     │
│    dwd-page.js                                      │
│    Parses URL → configures iframe,                  │
│    listens to Shiny broadcasts → updates URL,       │
│    title, and dynamic context block                 │
├─────────────────────────────────────────────────────┤
│ 4. Shiny App (Iframe)                               │
│    server.R                                         │
│    Receives URL params → broadcasts state           │
│    changes via postMessage to parent page           │
└─────────────────────────────────────────────────────┘
1. SEO Metadata Generation (Build Time)
Script: scripts/export_seo_metadata.R (in the DWD project)
Output: dwd-seo-metadata.json (in climateexplorer/netlify/edge-functions/)
This R script reads all 5 resolution RDS cache files and generates a JSON file containing:
- Stations: {resolution}/{state-slug}/{station-slug} → {id, name, state, stateSlug, elevation, lat, lon, resolution, resolutionLabel, resolutionSlug, overallStart, overallEnd, availableParams}
- States: {resolution}/{state-slug} → {state, stateSlug, resolution, resolutionLabel, resolutionSlug, stationCount, activeStationCount}
- Resolutions: {resolution-slug} → {key, label, slug, stationCount, activeStationCount}
- Slug map: display name → slug (for legacy URL redirect lookups)
To regenerate the metadata:
# Run from the DWD app root directory (clima/2025/dwd/)
Rscript scripts/export_seo_metadata.R
# Copy the output to the climateexplorer project (clima/2024/climateexplorer/)
cp dwd-seo-metadata.json ../../2024/climateexplorer/netlify/edge-functions/
2. Edge Function (Request Time)
File: climateexplorer/netlify/edge-functions/rewrite-meta.ts
When a request hits /dwd/{resolution}/{state?}/{station?}/:
- parseDwdPath() extracts path segments
- Looks up metadata from dwd-seo-metadata.json
- Injects into the HTML response:
  - <title>: e.g., "München-Flughafen, Bayern - Daily Climate Data | DWD Explorer"
  - <meta name="description">: station-specific description
  - <link rel="canonical">: canonical URL
  - OG/Twitter meta tags
  - JSON-LD breadcrumb: structured data for Google
  - Body content (<div id="dynamic-context">): rich HTML with station details, state lists, or country overview
  - window.__DWD_RESOLVED__: resolved metadata for the JS layer
This is server-side rendered, so Google sees full content without executing JavaScript.
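The path-parsing step might look roughly like the sketch below. It is written in plain JavaScript for illustration and is not the parseDwdPath() in rewrite-meta.ts. The function name parseDwdPathSketch, the returned shape, and the choice to return null for unknown resolutions or extra segments are all assumptions. The key field follows the metadata key format {resolution}/{state-slug}/{station-slug} described for dwd-seo-metadata.json.

```javascript
// Resolution slugs from the Resolution Slugs table.
const RESOLUTION_SLUGS = ["10-minutes", "hourly", "daily", "monthly", "annual"];

// Hypothetical sketch: split a request path into DWD segments.
// Returns null when the path is not a valid /dwd/ URL (assumed behaviour).
function parseDwdPathSketch(pathname) {
  const parts = pathname.split("/").filter(Boolean);
  if (parts[0] !== "dwd" || parts.length > 4) return null;

  // A missing resolution defaults to daily, matching the /dwd/ landing page.
  const [, resolution = "daily", state = null, station = null] = parts;
  if (!RESOLUTION_SLUGS.includes(resolution)) return null;

  // Key used to look up the matching entry in the metadata JSON.
  const key = [resolution, state, station].filter(Boolean).join("/");
  const level = station ? "station" : state ? "state" : "resolution";
  return { resolution, state, station, level, key, canonicalPath: `/dwd/${key}/` };
}

console.log(parseDwdPathSketch("/dwd/daily/bayern/muenchen-flughafen/"));
```

Returning the lookup key and the canonical path together keeps the metadata lookup and the canonical link derived from the same parsed segments.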
3. Parent Page JavaScript (Client Side)
File: climateexplorer/dwd/dwd-page.js
On page load:
- parsePathParams() extracts resolution/state/station from the URL
- If /dwd/ (no resolution), defaults to "Daily"
- Builds iframe URL with Shiny query params
- Uses __DWD_RESOLVED__ metadata (from edge function) to pass real station IDs/names to iframe
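A rough sketch of the iframe URL step is shown below. Most of it is assumed: the function name buildIframeUrl, the Shiny base URL, and the iframe query parameter names (resolution, landname, station, borrowed from the legacy URL format and the broadcast fields) are hypothetical. The resolved object stands in for window.__DWD_RESOLVED__ and uses field names from the station metadata (id, state, resolutionLabel).

```javascript
// Hypothetical sketch: combine resolved metadata and UI query params
// into the URL loaded by the Shiny iframe.
function buildIframeUrl(baseUrl, resolved, uiParams) {
  const url = new URL(baseUrl);

  // Shiny receives display values rather than slugs (assumed).
  url.searchParams.set("resolution", resolved.resolutionLabel || "Daily");
  if (resolved.state) url.searchParams.set("landname", resolved.state);
  if (resolved.id != null) url.searchParams.set("station", String(resolved.id));

  // Forward UI state (view, start, end) only when present in the page URL.
  for (const key of ["view", "start", "end"]) {
    const value = uiParams.get(key);
    if (value) url.searchParams.set(key, value);
  }
  return url.toString();
}

// Example with placeholder values (the station id and host are made up).
const iframeSrc = buildIframeUrl(
  "https://shiny.example.org/dwd/",
  { id: 12345, state: "Bayern", resolutionLabel: "Daily" },
  new URLSearchParams("view=dashboard-plots&start=2020-01-01")
);
console.log(iframeSrc);
```

In the browser, uiParams would typically come from new URLSearchParams(window.location.search).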
On Shiny state changes (via postMessage):
- handleIframeMessage() receives broadcast from iframe
- updateBrowserUrl() updates the browser URL (using history.replaceState)
- updatePageTitle() updates the browser tab title
- updateDynamicContext() updates the context block HTML
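The URL update can be reduced to a pure function that maps a broadcast message to a clean path, which is then handed to history.replaceState. The sketch below assumes the message fields listed for broadcast_state() (resolution, landname, stationName, view, start, end). The name pathFromBroadcast is hypothetical, as is the choice to omit view when it equals the default map. For the five UI labels, slugifying the label gives the URL slug (for example "10 Minutes" becomes 10-minutes).

```javascript
// Same steps as the documented slugify algorithm (separator runs collapse
// into one hyphen, which is an assumption).
function slugify(name) {
  return String(name)
    .replace(/[\u00fc\u00dc]/g, "ue")
    .replace(/[\u00f6\u00d6]/g, "oe")
    .replace(/[\u00e4\u00c4]/g, "ae")
    .replace(/\u00df/g, "ss")
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Hypothetical sketch: broadcast message -> clean path plus UI query string.
function pathFromBroadcast(msg) {
  const segments = ["dwd", slugify(msg.resolution || "Daily")];
  if (msg.landname) {
    segments.push(slugify(msg.landname));
    // A station segment only makes sense below a state segment.
    if (msg.stationName) segments.push(slugify(msg.stationName));
  }

  const query = new URLSearchParams();
  if (msg.view && msg.view !== "map") query.set("view", msg.view); // map is the default
  if (msg.start) query.set("start", msg.start);
  if (msg.end) query.set("end", msg.end);

  const qs = query.toString();
  return "/" + segments.join("/") + "/" + (qs ? "?" + qs : "");
}

// In the browser: history.replaceState(null, "", pathFromBroadcast(msg));
```

Keeping the path construction free of DOM access makes it straightforward to check against the slugs produced by the R script and the edge function.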
4. Shiny App Broadcasts (Iframe)
File: server.R (in the DWD project)
The broadcast_state() function sends a postMessage to the parent page with:
list(
  station = station_id,
  stationName = station_name,
  landname = state_name,       # German state
  resolution = resolution,     # UI label (e.g., "Daily")
  view = active_view,
  start = start_date,
  end = end_date,
  countryStationCount = ...,   # Total stations for this resolution
  countryActiveCount = ...,    # Active in current date range
  countryStateList = ...,      # State breakdown for context block
  ...
)
Broadcast triggers (observers in server.R):
- Tab/view changes
- Station selection changes
- Station deselection
- Resolution changes
- Date range changes
- State filter changes (ignoreNULL = FALSE, so it fires on clear)
Sitemap Integration
indexed-pages.json
File: climateexplorer/netlify/edge-functions/indexed-pages.json
Defines the curated URLs to include in sitemap.xml:
{
  "/dwd": {
    "stations": [
      { "path": "daily/bayern/muenchen-flughafen" },
      { "path": "daily/sachsen/leipzig-holzhausen" }
    ],
    "regions": [
      { "path": "daily/bayern" },
      { "path": "daily/berlin" }
    ],
    "resolutions": [
      { "path": "daily" },
      { "path": "hourly" },
      { "path": "10-minutes" }
    ]
  }
}
Sitemap Normalization
Script: climateexplorer/scripts/normalize-sitemap.mjs
Runs after quarto render to inject curated URLs into sitemap.xml. The validator (scripts/lib/indexed-pages-validator.mjs) auto-approves path-based entries (entries with a path field).
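To make the data flow concrete, the sketch below expands an indexed-pages.json structure into absolute sitemap URLs. It is not the logic of normalize-sitemap.mjs: the function name expandIndexedPages and the origin are hypothetical, and only entries with a string path field are handled.

```javascript
// Hypothetical sketch: turn curated path entries into absolute URLs.
function expandIndexedPages(indexedPages, origin) {
  const urls = [];
  for (const [section, groups] of Object.entries(indexedPages)) {
    for (const entries of Object.values(groups)) {
      for (const entry of entries) {
        // Only path-based entries are expanded here.
        if (typeof entry.path === "string") {
          urls.push(`${origin}${section}/${entry.path}/`);
        }
      }
    }
  }
  return urls;
}

// Trimmed-down sample in the shape shown above; the origin is a placeholder.
const sample = {
  "/dwd": {
    stations: [{ path: "daily/bayern/muenchen-flughafen" }],
    regions: [{ path: "daily/bayern" }],
    resolutions: [{ path: "daily" }, { path: "hourly" }],
  },
};
console.log(expandIndexedPages(sample, "https://example.org"));
```

Each resulting URL ends with a trailing slash, matching the clean URL examples earlier in this document.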
Google Discovery Chain
The sitemap contains ~36 DWD seed URLs. Google discovers all other pages through internal links:
Sitemap: 5 resolution pages ──→ Each links to 16 states
                                       │
                  ┌────────────────────┘
                  ▼
16 state pages ──→ Each links to all stations in that state
                                       │
                  ┌────────────────────┘
                  ▼
~1400 station pages (per resolution)
Total discoverable pages: more than 5,000 across all resolutions.
Legacy URL Redirect
Old query-parameter URLs are automatically redirected to clean paths:
/dwd/?resolution=Daily&landname=Bayern&station=München-Flughafen
↓ 301 redirect ↓
/dwd/daily/bayern/muenchen-flughafen/
Handled by the edge function's "DWD Legacy Query-Param Redirect" block.
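The mapping from a legacy URL to its clean path can be sketched as below. This is not the edge function's redirect block: the name legacyRedirectTarget is hypothetical, and the sketch slugifies the query values directly instead of using the slug map in dwd-seo-metadata.json. Treating any /dwd/ request with a resolution query parameter as legacy is also an assumption.

```javascript
// Same steps as the documented slugify algorithm (separator runs collapse
// into one hyphen, which is an assumption).
function slugify(name) {
  return String(name)
    .replace(/[\u00fc\u00dc]/g, "ue")
    .replace(/[\u00f6\u00d6]/g, "oe")
    .replace(/[\u00e4\u00c4]/g, "ae")
    .replace(/\u00df/g, "ss")
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Hypothetical sketch: return the clean path for a legacy URL,
// or null when the request is not a legacy query-param URL.
function legacyRedirectTarget(requestUrl) {
  const url = new URL(requestUrl);
  if (url.pathname !== "/dwd/" && url.pathname !== "/dwd") return null;

  const resolution = url.searchParams.get("resolution");
  if (!resolution) return null;

  const segments = ["dwd", slugify(resolution)];
  const state = url.searchParams.get("landname");
  const station = url.searchParams.get("station");
  if (state) {
    segments.push(slugify(state));
    if (station) segments.push(slugify(station));
  }
  return "/" + segments.join("/") + "/";
}

// A caller would answer with a 301 and this value in the Location header.
console.log(legacyRedirectTarget(
  "https://example.org/dwd/?resolution=Daily&landname=Bayern&station=M%C3%BCnchen-Flughafen"
)); // /dwd/daily/bayern/muenchen-flughafen/
```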
Applying to Other Sections
To implement this pattern for another section (e.g., /meteofrance/, /jma/):
1. Define the URL hierarchy
/{section}/{resolution}/{region?}/{station?}/
Choose meaningful slugs for resolutions, regions (departments, prefectures, countries), and stations.
2. Create SEO metadata
Write an R script to generate {section}-seo-metadata.json with:
- Station metadata (name, region, coordinates, data range, parameters)
- Region metadata (station counts)
- Resolution metadata (station counts)
- Slug map (display name → URL slug)
3. Update the edge function
Add a parse{Section}Path() function and inject body content + meta tags.
4. Create the page JavaScript
Write a {section}-page.js that:
- Parses path segments on load
- Configures the iframe with Shiny query params
- Listens for postMessage broadcasts and updates URL/title/context
5. Update Shiny's broadcast_state()
Ensure the Shiny app sends state/region/station names in its broadcasts so the JS can construct correct URLs.
6. Update indexed-pages.json
Add curated seed URLs for the section's resolutions, regions, and sample stations.
7. Verify
# Run sitemap normalization
QUARTO_PROJECT_OUTPUT_DIR=_site node scripts/normalize-sitemap.mjs
# Run sitemap checks
QUARTO_PROJECT_OUTPUT_DIR=_site node scripts/check-sitemap.mjs --fetch
# Test edge function locally
netlify dev
Key Files Reference
| File | Location | Purpose |
|---|---|---|
| export_seo_metadata.R | dwd/scripts/ | Generate SEO metadata JSON |
| dwd-seo-metadata.json | climateexplorer/netlify/edge-functions/ | Station/state/resolution metadata |
| rewrite-meta.ts | climateexplorer/netlify/edge-functions/ | Edge function (SSR injection) |
| dwd-page.js | climateexplorer/dwd/ | Client-side URL sync |
| server.R | dwd/ | Shiny broadcast_state() |
| indexed-pages.json | climateexplorer/netlify/edge-functions/ | Sitemap seed URLs |
| normalize-sitemap.mjs | climateexplorer/scripts/ | Sitemap URL injection |
| indexed-pages-validator.mjs | climateexplorer/scripts/lib/ | Validates curated URLs |