davmel committed on
Commit f816d7b · verified · 1 Parent(s): 2a47179

Update README.md

Files changed (1)
  1. README.md +7 -0
README.md CHANGED
@@ -1,3 +1,10 @@
+---
+license: mit
+title: ActiveUltraFeedback
+short_description: Sample-Efficient RLHF Preference data generation
+papers: 2603.09692
+---
+
 # Active UltraFeedback
 
 **Active UltraFeedback** is a scalable pipeline for generating high-quality preference datasets to align large language models (LLMs). We leverage **uncertainty quantification** and **active learning** to annotate only the most informative samples, drastically reducing costs while beating standard baselines.
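
The seven added lines form a metadata block delimited by `---` at the top of README.md. As a rough illustration of how such a block separates from the README body, here is a minimal sketch that uses only the Python standard library. It assumes flat `key: value` lines only and is not a YAML parser; the helper name `split_front_matter` is hypothetical and not part of any Hugging Face API.

```python
def split_front_matter(text):
    """Split a README into (metadata dict, body).

    Assumption: the block holds only flat `key: value` lines between two
    `---` delimiters. Lists, nesting, and quoting need a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter present
    try:
        # Index of the closing delimiter, searching after the opening one.
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # unterminated block: treat everything as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# The README head as it stands after this commit.
readme = """---
license: mit
title: ActiveUltraFeedback
short_description: Sample-Efficient RLHF Preference data generation
papers: 2603.09692
---

# Active UltraFeedback
"""

meta, body = split_front_matter(readme)
print(meta["license"])       # value of the `license` key
print(body.splitlines()[0])  # first line of the README body
```

All values come back as plain strings, so `papers` stays the text `2603.09692` rather than becoming a number.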