Finetune on longer sequences

by joelniklaus - opened Apr 26, 2023

Apr 26, 2023

Hi guys,
Great model suite, thanks a lot!
I am interested in finetuning this model on longer sequences (4096 to 8192 tokens). What would you say is the easiest way to do this?
Cheers,
Joel

Dihf

May 5, 2023

I have the same question.

nudelbrot

Jun 13, 2023

It's weird, but https://moon-ci-docs.huggingface.co/docs/transformers/pr_22810/en/model_doc/gpt_neox_alibi (and only that page) documents a GPT-NeoX ALiBi model. It might come from a pull request that never got merged.

Pythia should be in the same family as GPT-NeoX. I'm happy to look at other solutions, but I'm basically looking for the same thing.
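
For anyone unfamiliar with ALiBi, here is a minimal sketch of the bias it adds to attention scores, which is why it tends to come up for longer sequences. This is not the code from the linked docs page or from any Transformers class. The function names `alibi_slopes` and `alibi_bias` are made up for illustration, and the sketch assumes the number of heads is a power of two (the simplest case in the ALiBi paper).

```python
def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two.
    """
    start = 2 ** (-8 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """Return bias[h][i][j], the value added to the attention score of
    query position i attending to key position j in head h.

    The penalty grows linearly with the distance i - j. Positions with
    j > i are left at 0.0 here; a causal mask is expected to hide them.
    """
    slopes = alibi_slopes(num_heads)
    return [
        [
            [-m * (i - j) if j < i else 0.0 for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in slopes
    ]


if __name__ == "__main__":
    bias = alibi_bias(num_heads=8, seq_len=4)
    # Last query position in head 0: farther keys get a larger penalty.
    print(bias[0][3])
```

Since the bias depends only on distance, it can be computed for any `seq_len`. Whether a checkpoint trained with a different position scheme adapts well to it after finetuning is a separate question that this sketch does not answer.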
