Prompting of different code languages?

#40

by TAO12138 - opened May 25, 2023

May 25, 2023

Taking quick sort as an example, how should the prompt be written for different programming languages at inference time?

For Python, StarCoder directly generates the expected output from the prompt # quick sort, but it goes wrong for other languages. How can we make the model recognize which language is required?

TAO12138

May 25, 2023

Is the prompt something like // language: c++\n or # language: Python\n? With prompts like these, the test results are satisfactory.

loubnabnl

BigCode org May 25, 2023

We did condition on the filename during pretraining, so you can try prepending <filename>file_path.ext\n to the prompt, where ext is the extension of the language you want to generate code in. You can change the file path as you like.
For example, we found <filename>solutions/solution_1.py\n# Here is the correct implementation of the code exercise\n to help with solving HumanEval problems in Python.
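
As a rough illustration of the filename conditioning described above, here is a minimal sketch that only builds the prompt string. The helper name build_filename_prompt and the path solutions/quick_sort.cpp are hypothetical, chosen for this example; only the <filename>...\n format and the solution_1.py example come from this thread.

```python
def build_filename_prompt(file_path: str, body: str) -> str:
    """Put a <filename> tag in front of the prompt body.

    The file extension is what hints at the target language.
    The helper name and signature are assumptions made for this sketch.
    """
    return f"<filename>{file_path}\n{body}"


# Hypothetical C++ prompt for the quick sort case from the question.
cpp_prompt = build_filename_prompt("solutions/quick_sort.cpp", "// quick sort\n")

# The Python example quoted in the reply above.
py_prompt = build_filename_prompt(
    "solutions/solution_1.py",
    "# Here is the correct implementation of the code exercise\n",
)

print(cpp_prompt)
print(py_prompt)
```

The resulting string would then be tokenized and passed to the model like any other prompt. Whether a given path or comment works well for a particular language is something to check empirically.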

daanturo

May 26, 2023


Thank you! But in FIM mode, should I add that before or after <fim_prefix>?
Like this:

<filename>solutions/solution_1.py
<fim_prefix>...<fim_suffix>...

or like this:

<fim_prefix><filename>solutions/solution_1.py
...<fim_suffix>...

loubnabnl

BigCode org May 26, 2023

This should be part of the prefix, so the second option is the correct one. You can check this code that we use for FIM evaluation.
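
To make the second option concrete, here is a minimal sketch of assembling a FIM prompt with the filename tag inside the prefix. The helper build_fim_prompt and the example path and code fragments are hypothetical. The trailing <fim_middle> token is not shown in this thread; it is assumed here from StarCoder's usual prefix/suffix/middle FIM format.

```python
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"  # assumption: not shown in the thread


def build_fim_prompt(file_path: str, prefix: str, suffix: str) -> str:
    """Build a fill-in-the-middle prompt with <filename> as part of the prefix.

    Layout: <fim_prefix><filename>path\\n{prefix}<fim_suffix>{suffix}<fim_middle>
    The helper name and signature are assumptions made for this sketch.
    """
    return (
        f"{FIM_PREFIX}<filename>{file_path}\n"
        f"{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    )


# Hypothetical example: ask the model to fill in a function body.
prompt = build_fim_prompt(
    "solutions/solution_1.py",
    "def quick_sort(arr):\n    ",
    "\n    return result\n",
)
print(prompt)
```

With this layout, the text the model generates after <fim_middle> is the infill that belongs between the prefix and the suffix.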

loubnabnl changed discussion status to closed Jun 6, 2023
