How to finetune models

#56

by yangguofeng - opened Dec 30, 2025

Dec 30, 2025

Hello,
I currently have access to approximately 30 million protein sequences and am fine-tuning ProtGPT2 using a LoRA-based approach due to limited computational resources. Given that ProtGPT2 contains 36 transformer blocks, I am wondering whether it is more effective to apply LoRA adapters to all blocks, or to restrict them to a subset of blocks (such as the higher or middle-to-high layers) in order to balance performance and efficiency. I would appreciate any guidance on which blocks tend to be most important for adaptation in this setting while preserving the pretrained protein-level priors.

nferruz

Owner Jan 7

Thanks for reaching out, and apologies for the massive delay!
I haven’t personally fine-tuned ProtGPT2 with LoRA (I’ve only fine-tuned full models), but from what's out there in the literature, I'd apply the adapters to the upper part of the network rather than to every block.

Hope that helps!
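
For anyone who wants to try the "upper blocks only" idea, here is a minimal sketch, not something tested in this thread. It assumes the Hugging Face `peft` and `transformers` libraries, and it relies on ProtGPT2 using the GPT-2 architecture, where the blocks live under `transformer.h.<i>` and the fused attention projection is called `c_attn`. The fraction of blocks to adapt (the top third here) and the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are illustrative placeholders, not recommendations.

```python
def upper_block_indices(n_blocks: int, fraction: float) -> list[int]:
    """Return the indices of the top `fraction` of transformer blocks."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_adapted = max(1, round(n_blocks * fraction))
    return list(range(n_blocks - n_adapted, n_blocks))


def build_lora_model(model_name: str = "nferruz/ProtGPT2", fraction: float = 1 / 3):
    """Wrap the model with LoRA adapters on the upper blocks only (hypothetical helper)."""
    # Imported lazily so upper_block_indices works without these packages installed.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained(model_name)
    layers = upper_block_indices(model.config.n_layer, fraction)

    config = LoraConfig(
        task_type="CAUSAL_LM",
        r=8,                          # assumed rank, tune for your budget
        lora_alpha=16,                # assumed scaling
        lora_dropout=0.05,            # assumed dropout
        target_modules=["c_attn"],    # GPT-2 fused query/key/value projection
        fan_in_fan_out=True,          # GPT-2 uses Conv1D layers with transposed weights
        layers_to_transform=layers,   # restrict adapters to these block indices
        layers_pattern="h",           # GPT-2 blocks are named transformer.h.<i>
    )
    return get_peft_model(model, config)
```

Calling `build_lora_model()` and then `model.print_trainable_parameters()` shows how many parameters the adapters add. Changing `fraction` (for example `1.0` for all 36 blocks) makes it straightforward to compare the "all blocks" and "upper blocks" settings on a held-out set before committing to the full 30 million sequences.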

yangguofeng

Feb 1

Thank you!
