GRACE: Discriminator-Guided Chain-of-Thought Reasoning

This model is part of the work presented in the paper GRACE: Discriminator-Guided Chain-of-Thought Reasoning.

GRACE (Guiding chain-of-thought ReAsoning with a CorrectnEss Discriminator) is a stepwise decoding approach that steers the decoding process towards producing correct reasoning steps. It employs a step-level verifier or discriminator trained with a contrastive loss over correct and incorrect steps, which is used during decoding to score next-step candidates based on their correctness.
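
As a rough illustration of how a discriminator can steer the choice of the next step, the sketch below reranks sampled candidate steps by a weighted sum of the language model's log-probability and a discriminator score. The function names, the toy discriminator, and the exact form of the weighted sum are assumptions made for illustration; the actual scoring rule is defined in the paper and the official repository.

```python
from typing import Callable, List, Tuple


def select_next_step(
    candidates: List[Tuple[str, float]],
    discriminator: Callable[[str, str, str], float],
    question: str,
    prefix: str,
    beta: float = 0.1,
) -> str:
    """Return the candidate step with the highest combined score.

    candidates: (step_text, lm_logprob) pairs sampled from the language model.
    discriminator: hypothetical callable scoring a step given question and prefix.
    beta: weight on the discriminator score (assumed form: a convex combination).
    """
    def combined(item: Tuple[str, float]) -> float:
        step, lm_logprob = item
        return (1.0 - beta) * lm_logprob + beta * discriminator(question, prefix, step)

    best_step, _ = max(candidates, key=combined)
    return best_step


def toy_discriminator(question: str, prefix: str, step: str) -> float:
    """Stand-in for a trained discriminator: checks 'a + b = c' arithmetic."""
    left, right = step.split("=")
    a, b = left.split("+")
    return 1.0 if int(a) + int(b) == int(right) else -1.0


# The language model slightly prefers the incorrect step.
candidates = [("3 + 4 = 8", -0.5), ("3 + 4 = 7", -0.7)]
unguided = select_next_step(candidates, toy_discriminator, "What is 3 + 4?", "", beta=0.0)
guided = select_next_step(candidates, toy_discriminator, "What is 3 + 4?", "", beta=0.1)
print(unguided)
print(guided)
```

With `beta=0.0` the choice depends only on the language model, so the incorrect step wins; with `beta=0.1` the discriminator's penalty is enough to flip the choice to the correct step.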

Sample Usage

The official implementation of guided decoding with this model is available in the GitHub repository. The following command runs GRACE decoding:

WANDB_MODE=disabled python run_grace.py \
    --model_name_or_path mkhalifa/flan-t5-large-gsm8k \
    --in_file data/gsm8k/dev.jsonl \
    --task gsm8k \
    --disc_path ckpts/discrim/flan-t5-gsm8k/ \
    --beta 0.1 --n_candidate_steps 20 --generation_type step-score \
    --step_sampling_method top_p --device2 cuda:0 --top_p .95 --sample_calc true \
    --max_steps 6 --max_step_length 60 --step_delimiter '|' --temperature .8 --n_self_consistency 1 --seed 42
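
To give a feel for what a stepwise decoding loop of this kind might look like, here is a minimal Python sketch. The parameter names mirror the `--n_candidate_steps`, `--max_steps`, and `--step_delimiter` flags above, but their semantics here are inferred from the names only. The `propose_steps` and `score_step` callables and the `answer:` stopping rule are hypothetical stand-ins, not part of `run_grace.py`.

```python
from typing import Callable, List

# Hypothetical pool of steps that the toy proposer cycles through.
POOL = ["2 + 3 = 6", "2 + 3 = 5", "answer: 5"]


def guided_decode(
    question: str,
    propose_steps: Callable[[str, str, int], List[str]],
    score_step: Callable[[str, str, str], float],
    n_candidate_steps: int = 20,
    max_steps: int = 6,
    step_delimiter: str = "|",
) -> str:
    """Build a solution one step at a time, keeping the best-scored candidate."""
    steps: List[str] = []
    for _ in range(max_steps):
        prefix = f" {step_delimiter} ".join(steps)
        candidates = propose_steps(question, prefix, n_candidate_steps)
        best = max(candidates, key=lambda s: score_step(question, prefix, s))
        steps.append(best)
        # Assumed stopping rule: a step starting with "answer:" ends the solution.
        if best.startswith("answer:"):
            break
    return f" {step_delimiter} ".join(steps)


def toy_propose(question: str, prefix: str, n: int) -> List[str]:
    """Stand-in for sampling n candidate steps from a language model."""
    return [POOL[i % len(POOL)] for i in range(n)]


def toy_score(question: str, prefix: str, step: str) -> float:
    """Stand-in for the combined step score used to rank candidates."""
    if step == "2 + 3 = 5":
        return 1.0 if not prefix else -1.0  # useful once, redundant afterwards
    if step == "answer: 5":
        return 1.0 if prefix else 0.0       # answer only after a reasoning step
    return -2.0                             # incorrect arithmetic


solution = guided_decode("What is 2 + 3?", toy_propose, toy_score)
print(solution)
```

The loop stops at whichever comes first: the assumed final-answer marker or the `max_steps` limit.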

Citation

If you use this work, please cite the following paper:

@article{khalifa2023grace,
  title={{GRACE}: Discriminator-Guided Chain-of-Thought Reasoning},
  author={Khalifa, Muhammad and Logeswaran, Lajanugen and Lee, Moontae and Lee, Honglak and Wang, Lu},
  journal={arXiv preprint arXiv:2305.14934},
  year={2023}
}