Question Regarding Fine-tuning Format Used in ThinkGuard (LLaMA Factory?)


I recently read your paper “THINKGUARD: Deliberative Slow Thinking Leads to Cautious Guardrails” and found your approach to safety alignment through critique-augmented fine-tuning both innovative and inspiring.

I have a quick technical question:
Did you use LLaMA Factory for fine-tuning your LLaMA Guard 3-8B model? If so, could you kindly share the data format you adopted for training—was it closer to the Alpaca-style (instruction/input/output) or something else?

I’m currently experimenting with a similar setup and would greatly appreciate any insight you could provide.

Thanks again for your valuable contribution to the field, and I look forward to your response!

Best regards,
Wendy

Hi Wendy,

Thank you for your interest!

Yes, we used LLaMA Factory to fine-tune the LLaMA Guard 3-8B model. The data format we adopted follows the Alpaca-style (instruction/input/output) structure, which works well with LLaMA Factory's SFT pipeline. Each training example is a JSON object of the form:

{
  "instruction": "...",
  "input": "...",
  "output": "..."
}
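In case it helps with your setup: to point LLaMA Factory at a file of such records, you register it in data/dataset_info.json with the standard Alpaca column mapping. Below is a sketch of that registration; "thinkguard_sft" and the file name are placeholders for illustration, not our actual dataset names:

{
  "thinkguard_sft": {
    "file_name": "thinkguard_sft.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}

You can then reference it through the dataset field of your SFT training config (e.g., dataset: thinkguard_sft) when launching training with llamafactory-cli.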

Best,
Xiaofei
