QLoRA fine-tune

by NickyNicky - opened Mar 28, 2024

Mar 28, 2024

This model looks great, but I have a problem with QLoRA: when I fine-tune it, VRAM usage shoots up to more than 30 GB instead of the 15 GB I normally use with the script you provided.

vikhyatk

Owner Mar 29, 2024

How are you doing the QLoRA finetuning? Would appreciate reproduction steps. :)

nicolollo

Apr 2, 2024

How are you doing the QLoRA finetuning? Would appreciate reproduction steps. :)

me too XD

rockerBOO

May 1, 2024

•

edited May 1, 2024

Load the model in 4-bit, then wrap it with PEFT as usual. This needs the bitsandbytes package as well.

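    import torch
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # 4-bit NF4 quantization with double quantization; compute in bfloat16
    nf4_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    # model_id and revision are the same values you already use to load the model
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,
        revision=revision,
        quantization_config=nf4_config,
    )

    target_modules = ["mixer.Wqkv"]
    config = LoraConfig(
        r=1,
        lora_alpha=1,
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
        # target specific modules (look for Linear in the model)
        # print(model) to see the architecture of the model
        target_modules=target_modules,
    )
    # Wrap the quantized model so only the LoRA adapter weights are trainable
    model = get_peft_model(model, config)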
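
For a rough sense of how small the adapter is with `r=1`, here is a minimal sketch of the standard LoRA parameter count for one targeted Linear layer: a rank-`r` adapter adds an `r x in_features` matrix and an `out_features x r` matrix. The helper name `lora_param_count` and the layer shapes below (`hidden`, `num_layers`) are made up for illustration and are not taken from this model; read the real shapes from `print(model)`.

```python
def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Trainable parameters LoRA adds to a single Linear layer.

    LoRA learns A (r x in_features) and B (out_features x r).
    With bias="none", no bias terms are trained.
    """
    return r * (in_features + out_features)


# Hypothetical shapes, for illustration only (not this model's real sizes).
hidden = 2048      # assumed hidden size
num_layers = 24    # assumed number of blocks that contain a Wqkv projection

# A fused QKV projection maps hidden -> 3 * hidden.
per_layer = lora_param_count(hidden, 3 * hidden, r=1)
total = num_layers * per_layer

print(f"per layer: {per_layer}, total: {total}")
```

On the real wrapped model, `model.print_trainable_parameters()` from PEFT reports the actual trainable and total counts, which is a quick check that only the adapter weights are being trained.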
