Training Setup

by JackalAI - opened May 23, 2023

May 23, 2023

Hi There!

I'm new to a lot of this and have a very constrained schedule, so I haven't been able to dedicate the time I'd need to do the level of training that interests me. So far, I've only done LoRA training through oobabooga, and on 30B+ models every attempt has hit an out-of-memory (OOM) error right at the end of training.

Is there a particular script or framework you use for training, merging, etc. on these larger models that supports multiple GPUs? I figure I'll probably need to buckle down and read some books/papers to really understand what's going on, but things are moving so fast that it's hard to know where my learning time is best spent.

Either way, really appreciate the work you've done here!

-J

digitous

Owner May 25, 2023

Thanks for asking and your interest!

Weird as it sounds, I don't train these models or the LoRAs used in them; I use a model merge script and a LoRA merge script to do the work that goes into these models.

https://github.com/Digitous/Enhanced-LM-Mixer
https://github.com/tloen/alpaca-lora
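
For anyone curious what "merging" means in principle, here is a minimal sketch in plain Python. It is not taken from either repository above, and the function names (`merge_weights`, `merge_lora`) and the toy list-based weights are hypothetical, chosen only for illustration. It assumes a model merge is a per-tensor weighted average of two checkpoints with identical layouts, and it uses the standard LoRA formula W' = W + (alpha / r) * B A for folding an adapter into a base weight. Real scripts operate on framework tensors and may use different blending strategies.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def merge_weights(a: Dict[str, Vector], b: Dict[str, Vector],
                  ratio: float = 0.5) -> Dict[str, Vector]:
    """Blend two checkpoints: ratio * a + (1 - ratio) * b, tensor by tensor.

    Assumption: both checkpoints share the same tensor names and sizes.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must contain the same tensor names")
    merged: Dict[str, Vector] = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for tensor {name!r}")
        merged[name] = [ratio * x + (1.0 - ratio) * y
                        for x, y in zip(a[name], b[name])]
    return merged


def merge_lora(w: Matrix, lora_a: Matrix, lora_b: Matrix,
               alpha: float, rank: int) -> Matrix:
    """Fold a LoRA adapter into a base weight: W + (alpha / rank) * B @ A.

    Shapes: w is d_out x d_in, lora_b is d_out x rank, lora_a is rank x d_in.
    """
    scale = alpha / rank
    d_out, d_in = len(w), len(w[0])
    return [
        [w[i][j] + scale * sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
         for j in range(d_in)]
        for i in range(d_out)
    ]


if __name__ == "__main__":
    # Toy values only; real checkpoints hold millions of parameters per tensor.
    print(merge_weights({"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}, ratio=0.5))
    print(merge_lora([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]], [[1.0], [2.0]],
                     alpha=2.0, rank=1))
```

Neither operation in this sketch computes gradients or optimizer state; each is a single pass of arithmetic over the stored weights.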

JackalAI

May 29, 2023

Appreciate the answer anyway! Every link I collect gets me a bit closer to working in this ecosystem. Lots of fun to be had, I think!

Take care, and thanks again for the models and the engagement!

-J
