fine-tuning using FSDP and non 80GB cards?

#43

by besiktas - opened Oct 31, 2023

Oct 31, 2023

Has anyone had any luck fine-tuning this model on cards with less than 80GB of memory? I am using a training script that is pretty close to the llama finetuning.py script and have tried every combination of options I can think of (though perhaps I'm missing an obvious one), but I am unable to get a forward/backward pass of the model to complete. I'm using 8x48GB cards.
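For anyone sizing this up, here is a rough back-of-the-envelope estimate of the per-GPU memory that fully sharded mixed-precision training with Adam needs for weights, gradients, and optimizer state. It is only a sketch: the parameter count of 8e9 is a hypothetical round number and not a figure from this thread, the 16 bytes per parameter is the usual mixed-precision Adam rule of thumb, and activations are not modeled at all.

```python
# Rough per-GPU memory estimate for fully sharded mixed-precision training.
# ASSUMPTION: 8e9 parameters is a hypothetical round number, not taken from this thread.
# ASSUMPTION: 16 bytes/param = 2 (bf16 weights) + 2 (bf16 grads)
#             + 12 (fp32 master weights and the two Adam moments).

BYTES_PER_PARAM = 2 + 2 + 12


def sharded_state_gb(num_params: float, num_gpus: int,
                     bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Per-GPU GB (1e9 bytes) for weights, grads and optimizer state when fully sharded."""
    return num_params * bytes_per_param / num_gpus / 1e9


def activation_headroom_gb(card_gb: float, num_params: float, num_gpus: int) -> float:
    """Memory left per GPU for activations, gathered parameters and buffers."""
    return card_gb - sharded_state_gb(num_params, num_gpus)


if __name__ == "__main__":
    for card_gb in (48, 80):
        state = sharded_state_gb(8e9, 8)
        headroom = activation_headroom_gb(card_gb, 8e9, 8)
        print(f"{card_gb}GB card: {state:.1f} GB sharded state, {headroom:.1f} GB left")
```

Under these assumptions the sharded state itself fits on 48GB cards, so the pressure would come from whatever has to fit in the remaining headroom: activations, plus the full parameters of each wrapped unit that FSDP gathers temporarily during forward and backward.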

Molbap

Nov 7, 2023

Hi @besiktas , there is a discussion around this PR https://github.com/huggingface/transformers/pull/26997 which you may find relevant :)

besiktas

Nov 7, 2023

Hey @Molbap, thanks. I believe I saw that before and was not able to get it to work, but I have a working example now.

yinggong

Nov 8, 2023

Can you finetune the fuyu model? I have 80GB cards.

besiktas

Nov 9, 2023

@yinggong you shouldn't have a problem with 80GB cards

yinggong

Nov 21, 2023

> @yinggong you shouldn't have a problem with 80GB cards

Do you know where to find the training script?

The one from https://github.com/huggingface/transformers/issues/27255 does not seem to work.

Thanks!

cysmnl

Nov 21, 2023

I ran into a CUDA memory issue when running run_fuyu_no_trainer.py on a single A100 card. I resolved it by using 4-bit models, which work on either a 3090 or an A100 card.
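A minimal sketch of what 4-bit loading can look like with transformers and bitsandbytes, in case it helps. This is not the code from run_fuyu_no_trainer.py: the checkpoint id `adept/fuyu-8b`, the function name `load_fuyu_4bit`, and the NF4/bfloat16 settings are assumptions chosen for illustration.

```python
# Sketch: load Fuyu with 4-bit quantized weights via bitsandbytes.
# ASSUMPTION: the checkpoint id below is not named in this thread.
MODEL_ID = "adept/fuyu-8b"


def load_fuyu_4bit(model_id: str = MODEL_ID):
    """Return (model, processor) with the model weights quantized to 4 bits."""
    # Local imports so this file can be imported without the heavy dependencies.
    import torch
    from transformers import BitsAndBytesConfig, FuyuForCausalLM, FuyuProcessor

    # ASSUMPTION: NF4 with bfloat16 compute is one common choice, not a requirement.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    processor = FuyuProcessor.from_pretrained(model_id)
    model = FuyuForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the layers on available devices
    )
    return model, processor


if __name__ == "__main__":
    model, processor = load_fuyu_4bit()
    print(type(model).__name__)
```

Note that 4-bit weights are normally kept frozen, so fine-tuning on top of them is usually done by training small adapter layers (for example LoRA) rather than the quantized weights themselves.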

besiktas

Nov 28, 2023

@yinggong here is a super simple script that allows you to test if you can get the model running on your machine: https://github.com/grahamannett/finetune-fuyu/blob/main/train-simple.py

It mocks a text input and image so that you can be sure the processor works as well.
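For readers who just want the general idea, a smoke test along these lines might look roughly like the sketch below. It is not the linked script: the checkpoint id, the prompt, the image size, and the function names `make_mock_inputs` and `smoke_test` are all made up for illustration, and it assumes a transformers version that ships `FuyuProcessor` and `FuyuForCausalLM`.

```python
# Sketch: push one mocked text + image example through the processor and model.
# ASSUMPTION: checkpoint id, prompt and image size are illustrative, not from the linked script.
MODEL_ID = "adept/fuyu-8b"
MOCK_PROMPT = "Describe this image.\n"


def make_mock_inputs(width: int = 300, height: int = 200):
    """Return a (text, PIL image) pair; the image is a plain gray rectangle."""
    from PIL import Image

    image = Image.new("RGB", (width, height), color=(127, 127, 127))
    return MOCK_PROMPT, image


def smoke_test(model_id: str = MODEL_ID, run_backward: bool = False) -> float:
    """Run a forward pass (and optionally a backward pass) and return the loss."""
    import torch
    from transformers import FuyuForCausalLM, FuyuProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = FuyuProcessor.from_pretrained(model_id)
    model = FuyuForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to(device)

    text, image = make_mock_inputs()
    inputs = processor(text=text, images=image, return_tensors="pt").to(device)

    # Use the input ids as labels so the model returns a language-modeling loss.
    outputs = model(**inputs, labels=inputs["input_ids"])
    if run_backward:
        outputs.loss.backward()
    return outputs.loss.item()


if __name__ == "__main__":
    print("loss:", smoke_test())
```

If the forward pass succeeds but `run_backward=True` runs out of memory, that at least separates a processor or input problem from a memory problem.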
