ActiveUltraFeedback: Sample-Efficient RLHF Preference Data Generation using Active Learning