Could you please explain how the Verifier was trained? The paper seems to only mention that GPT-4 was used as the Verifier.
Good question, though I haven't replied in a long time.