I am trying to evaluate the test-set accuracy of the reward model. Please help with it!