Question Regarding Fine-tuning Format Used in ThinkGuard (LLaMA Factory?)


I recently read your paper “THINKGUARD: Deliberative Slow Thinking Leads to Cautious Guardrails” and found your approach to safety alignment through critique-augmented fine-tuning both innovative and inspiring.

I have a quick technical question:
Did you use LLaMA Factory for fine-tuning your LLaMA Guard 3-8B model? If so, could you kindly share the data format you adopted for training—was it closer to the Alpaca-style (instruction/input/output) or something else?

I’m currently experimenting with a similar setup and would greatly appreciate any insight you could provide.

Thanks again for your valuable contribution to the field, and I look forward to your response!

Best regards,
Wendy

Hi Wendy,

Thank you for your interest!

Yes, we used LLaMA Factory to fine-tune the LLaMA Guard 3-8B model. The data format we adopted follows the Alpaca-style (instruction/input/output) structure, which works well with LLaMA Factory's SFT pipeline. Each training example is a JSON object of the form:

{
  "instruction": "...",
  "input": "...",
  "output": "..."
}
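In case it helps with your setup: to point LLaMA Factory at a file of such records, you register it in data/dataset_info.json with the standard Alpaca column mapping. Below is a sketch of that registration; "thinkguard_sft" and the file name are placeholders for illustration, not our actual dataset names:

{
  "thinkguard_sft": {
    "file_name": "thinkguard_sft.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}

You can then reference it through the dataset field of your SFT training config (e.g., dataset: thinkguard_sft) when launching training with llamafactory-cli.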

Best,
Xiaofei
