metadata
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- reasoning
- chain-of-thought
- math
GRACE: Discriminator-Guided Chain-of-Thought Reasoning
This model is part of the work presented in the paper GRACE: Discriminator-Guided Chain-of-Thought Reasoning.
GRACE (Guiding chain-of-thought ReAsoning with a CorrectnEss Discriminator) is a stepwise decoding approach that steers generation toward correct reasoning steps. It uses a step-level correctness discriminator, trained with a contrastive loss over correct and incorrect steps, to score candidate next steps during decoding.
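To make the stepwise procedure concrete, here is a minimal, self-contained sketch of discriminator-guided step selection. It is not the repository's implementation. The names `guided_decode`, `toy_sampler`, and `toy_discriminator` are hypothetical, the sampler and discriminator are toy stand-ins for the language model and the trained discriminator, and the weighting `(1 - beta) * log_prob + beta * discriminator_score` is an assumption about how a `beta` weight might combine the two signals.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces for illustration only (not the repository's API).
# A sampler proposes up to n (step_text, log_prob) candidates given the question and prior steps.
StepSampler = Callable[[str, List[str], int], List[Tuple[str, float]]]
# A discriminator scores how likely a candidate step is to be correct in context.
Discriminator = Callable[[str, List[str], str], float]


def guided_decode(
    question: str,
    sample_steps: StepSampler,
    discriminator: Discriminator,
    beta: float = 0.1,
    n_candidate_steps: int = 20,
    max_steps: int = 6,
    is_final: Callable[[str], bool] = lambda step: "answer" in step.lower(),
) -> List[str]:
    """Build a solution one step at a time, keeping the best-scoring candidate each time."""
    steps: List[str] = []
    for _ in range(max_steps):
        candidates = sample_steps(question, steps, n_candidate_steps)
        if not candidates:
            break

        # Assumed scoring rule: interpolate LM log-probability and discriminator score.
        def score(candidate: Tuple[str, float]) -> float:
            text, log_prob = candidate
            return (1.0 - beta) * log_prob + beta * discriminator(question, steps, text)

        best_text, _ = max(candidates, key=score)
        steps.append(best_text)
        if is_final(best_text):
            break
    return steps


# Toy stand-ins: the "LM" slightly prefers the wrong step at each position.
def toy_sampler(question: str, prefix: List[str], n: int) -> List[Tuple[str, float]]:
    table = [
        [("3 * 2 = 6", -1.0), ("3 + 2 = 5", -1.1)],
        [("The answer is 6", -1.0), ("The answer is 5", -1.1)],
    ]
    return table[len(prefix)][:n] if len(prefix) < len(table) else []


def toy_discriminator(question: str, prefix: List[str], step: str) -> float:
    correct_steps = {"3 + 2 = 5", "The answer is 5"}
    return 1.0 if step in correct_steps else -1.0


question = "Tom has 3 apples and buys 2 more. How many apples does he have?"
print(guided_decode(question, toy_sampler, toy_discriminator, beta=0.1))
# prints ['3 + 2 = 5', 'The answer is 5']
```

With `beta=0.0` the same toy setup follows the language model alone and selects the incorrect steps, which illustrates why the discriminator term matters even at a small weight.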
Resources
- Paper: GRACE: Discriminator-Guided Chain-of-Thought Reasoning
- GitHub Repository: https://github.com/mukhal/grace
- Authors: Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Lu Wang
Sample Usage
The official implementation for running guided decoding with this model is available in the GitHub repository. The following command shows an example GRACE decoding run on the GSM8K dev set:
WANDB_MODE=disabled python run_grace.py \
--model_name_or_path mkhalifa/flan-t5-large-gsm8k \
--in_file data/gsm8k/dev.jsonl \
--task gsm8k \
--disc_path ckpts/discrim/flan-t5-gsm8k/ \
--beta 0.1 --n_candidate_steps 20 --generation_type step-score \
--step_sampling_method top_p --device2 cuda:0 --top_p .95 --sample_calc true \
--max_steps 6 --max_step_length 60 --step_delimiter '|' --temperature .8 --n_self_consistency 1 --seed 42
Citation
If you use this work, please cite the following paper:
@article{khalifa2023grace,
title={Grace: Discriminator-guided chain-of-thought reasoning},
author={Khalifa, Muhammad and Logeswaran, Lajanugen and Lee, Moontae and Lee, Honglak and Wang, Lu},
journal={arXiv preprint arXiv:2305.14934},
year={2023}
}