---
title: AFDBench
emoji: 🌦
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.19.2
app_file: app.py
pinned: true
license: apache-2.0
python_version: 3.11
short_description: The Weather Forecast Discussion Alignment Benchmark
---

# AFDBench: Area Forecast Discussion Benchmark

AFDBench evaluates how well AI models generate professional meteorological text compared to Human NWS Forecasters.

### Core Metrics:

1. **Met-Align**: Physical accuracy vs. Human numerical choices.
2. **Style-Align**: Linguistic alignment with NWS AFD professional prose.

Initial results on 7,734 human samples reveal a massive **Meteorological Hallucination Gap** in zero-shot open models.