AutoPCR: Automated Phenotype Concept Recognition by Prompting
Motivation: Phenotype concept recognition (CR) is a fundamental task in biomedical text mining. However, existing methods either require ontology-specific training, which limits their ability to generalize across diverse text styles and evolving biomedical terminology, or depend on general-purpose large language models (LLMs) that lack the necessary domain knowledge.

Results: To address these limitations, we propose AutoPCR, a prompt-based phenotype CR method designed to generalize automatically to new ontologies and unseen data without ontology-specific training. To further boost performance, we also introduce an optional self-supervised training strategy. Experiments show that AutoPCR achieves the best average performance and the most robust performance across datasets. Further ablation and transfer studies demonstrate its inductive capability and its generalizability to new ontologies.

Availability and Implementation: Our code is available at https://github.com/yctao7/AutoPCR.

Contact: drjieliu@umich.edu
