| # UTAustin-AIHealth | |
| Welcome to **UTAustin-AIHealth** – a hub dedicated to advancing research in medical AI. | |
| This repo contains the **MedHallu** dataset, which underpins our recent work: | |
| **MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models** | |
| MedHallu is a rigorously designed benchmark intended to evaluate large language models' ability to detect hallucinations in medical question-answering tasks. | |
| The dataset is organized into two distinct splits: | |
| - **pqa_labeled:** Contains 1,000 high-quality, human-annotated samples derived from PubMedQA. | |
| - **pqa_artificial:** Contains 9,000 samples generated via an automated pipeline from PubMedQA. | |
| --- | |
| ## Setup Environment | |
| To work with the MedHallu dataset, please install the Hugging Face `datasets` library using pip: | |
| ```bash | |
| pip install datasets | |
| ``` | |
| ## How to Use MedHallu | |
| **Downloading the Dataset:** | |
| ```python | |
| from datasets import load_dataset | |
| # Load the 'pqa_labeled' split: 1,000 high-quality, human-annotated samples. | |
| medhallu_labeled = load_dataset("UTAustin-AIHealth/MedHallu", "pqa_labeled") | |
| # Load the 'pqa_artificial' split: 9,000 samples generated via an automated pipeline. | |
| medhallu_artificial = load_dataset("UTAustin-AIHealth/MedHallu", "pqa_artificial") | |
| ``` | |
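Once loaded, it can help to check what each object actually contains before writing evaluation code. In `datasets`, the second positional argument to `load_dataset` is the configuration name, and when no `split` argument is passed the call generally returns a `DatasetDict` keyed by split name. The sketch below is a minimal, schema-agnostic helper; `describe_splits` is a hypothetical name introduced here for illustration, not part of MedHallu or the `datasets` library. It relies only on `len()` and the `column_names` attribute, so it makes no assumption about MedHallu's column names.

```python
def describe_splits(dataset_dict):
    """Summarize a DatasetDict-like mapping.

    Returns {split_name: (number_of_rows, list_of_column_names)}.
    Assumes each value supports len() and exposes `column_names`,
    as `datasets.Dataset` objects do.
    """
    summary = {}
    for split_name, split in dataset_dict.items():
        summary[split_name] = (len(split), list(split.column_names))
    return summary


def print_summary(dataset_dict):
    """Print one line per split: name, row count, and column names."""
    for split_name, (num_rows, columns) in describe_splits(dataset_dict).items():
        print(f"{split_name}: {num_rows} rows, columns = {columns}")
```

For example, calling `print_summary(medhallu_labeled)` after the loading code above should print the row count and column names of each split in the `pqa_labeled` configuration.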
| --- | |
| ## License | |
| This dataset and associated resources are distributed under the [MIT License](https://opensource.org/license/mit/). | |
| ## Citations | |
| If you find MedHallu useful in your research, please consider citing our work: | |
| ```bibtex | |
| @misc{pandit2025medhallucomprehensivebenchmarkdetecting, | |
| title={MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models}, | |
| author={Shrey Pandit and Jiawei Xu and Junyuan Hong and Zhangyang Wang and Tianlong Chen and Kaidi Xu and Ying Ding}, | |
| year={2025}, | |
| eprint={2502.14302}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CL}, | |
| url={https://arxiv.org/abs/2502.14302}, | |
| } | |
| ``` | |
| ## Contact | |
| For further information or inquiries about MedHallu, please contact shreypandit@utexas.edu. |