manmeet3591/AFDBench: README.md

	---
	title: AFDBench
	emoji: 🌦
	colorFrom: blue
	colorTo: green
	sdk: gradio
	sdk_version: 4.19.2
	app_file: app.py
	pinned: true
	license: apache-2.0
	python_version: "3.11"
	short_description: The Weather Forecast Discussion Alignment Benchmark
	---

	# AFDBench: Area Forecast Discussion Benchmark

	AFDBench evaluates how closely AI-generated Area Forecast Discussions (AFDs) match those written by human National Weather Service (NWS) forecasters.

	### Core Metrics:
	1. Met-Align: physical accuracy, measured against the human forecaster's numerical choices.
	2. Style-Align: linguistic alignment with the professional prose of NWS AFDs.
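
To make the two metrics more concrete, here is a minimal sketch of what simple proxies for them could look like. This README does not say how Met-Align or Style-Align are actually computed, so everything below is an assumption made for illustration: the function names `toy_met_align` and `toy_style_align`, the regex-based number extraction, the tolerance of 2 units, and the vocabulary-overlap score are hypothetical stand-ins, not the benchmark's implementation.

```python
import re

# Hypothetical helper: matches integers and decimals. The lookbehind keeps the
# hyphen in a range such as "70-75" from being read as a minus sign.
NUMBER = re.compile(r"(?<![\w.])-?\d+(?:\.\d+)?")


def extract_numbers(text: str) -> list[float]:
    """Return every numeric value mentioned in the text, in order."""
    return [float(match) for match in NUMBER.findall(text)]


def toy_met_align(generated: str, human: str, tolerance: float = 2.0) -> float:
    """Toy proxy (assumption): share of the human forecaster's numbers that
    the generated text reproduces to within `tolerance`."""
    human_values = extract_numbers(human)
    generated_values = extract_numbers(generated)
    if not human_values:
        return 1.0  # nothing numerical to contradict
    matched = sum(
        any(abs(h - g) <= tolerance for g in generated_values)
        for h in human_values
    )
    return matched / len(human_values)


def toy_style_align(generated: str, human: str) -> float:
    """Toy proxy (assumption): Jaccard overlap of the two vocabularies."""
    generated_words = set(re.findall(r"[a-z]+", generated.lower()))
    human_words = set(re.findall(r"[a-z]+", human.lower()))
    union = generated_words | human_words
    if not union:
        return 1.0
    return len(generated_words & human_words) / len(union)


# Invented example sentences, not taken from the benchmark data.
human_afd = "Highs near 85 with south winds 10 to 15 mph."
model_afd = "Highs around 84, south winds near 25 mph."

print("toy Met-Align:", toy_met_align(model_afd, human_afd))
print("toy Style-Align:", toy_style_align(model_afd, human_afd))
```

The split mirrors the list above: the first function looks only at numbers and the second only at wording, so a model can score well on one and poorly on the other. A real scorer would also need to handle units, ranges, and which physical quantity each number refers to, all of which this sketch ignores.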

	Initial results on 7,734 human-written samples reveal a large Meteorological Hallucination Gap for open models evaluated zero-shot.