---
title: AFDBench
emoji: 🌦
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.19.2
app_file: app.py
pinned: true
license: apache-2.0
python_version: 3.11
short_description: The Weather Forecast Discussion Alignment Benchmark
---
# AFDBench: Area Forecast Discussion Benchmark
AFDBench evaluates how well AI models generate professional meteorological text, using Area Forecast Discussions (AFDs) written by human National Weather Service (NWS) forecasters as the reference.
### Core Metrics:
1. **Met-Align**: Physical accuracy, measured against the numerical choices made by human forecasters.
2. **Style-Align**: Linguistic alignment with the professional prose of NWS AFDs.
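
To make the idea behind Met-Align concrete, here is a minimal sketch of one way to compare the numbers in a generated discussion against those in a human-written one. It is an illustration only and not the benchmark's actual scoring code: the function names `extract_numbers` and `numeric_agreement`, the regular expression, and the `tolerance` default are all assumptions made for this example.

```python
import re

# Hypothetical helper, not part of AFDBench: matches integers and decimals,
# treating a leading "-" as a sign only when it does not follow a word
# character (so "10-15" yields 10 and 15, not 10 and -15).
NUMBER_PATTERN = re.compile(r"(?<![\w.])-?\d+(?:\.\d+)?")


def extract_numbers(text: str) -> list[float]:
    """Return every numeric value found in free text, in order of appearance."""
    return [float(match) for match in NUMBER_PATTERN.findall(text)]


def numeric_agreement(model_text: str, human_text: str, tolerance: float = 2.0) -> float:
    """Fraction of numbers in the human text that the model text reproduces.

    A human value counts as matched when any model value lies within
    `tolerance` of it. The tolerance is an assumed, unit-agnostic threshold.
    """
    human_values = extract_numbers(human_text)
    model_values = extract_numbers(model_text)
    if not human_values:
        # Nothing to match: perfect only if the model also states no numbers.
        return 1.0 if not model_values else 0.0
    matched = sum(
        any(abs(h - m) <= tolerance for m in model_values) for h in human_values
    )
    return matched / len(human_values)


human = "Highs near 85 with winds 10 to 15 mph."
model = "Highs around 84, winds 10 to 20 mph."
score = numeric_agreement(model, human)  # 85 and 10 match, 15 does not
```

This sketch ignores units and which variable each number refers to, so a real physical-accuracy metric would need to pair values with their quantities before comparing them.
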
Initial results on 7,734 human-written samples reveal a large **Meteorological Hallucination Gap** in zero-shot open models.