How did you train / fine-tune this?

by venketh - opened Nov 9, 2023

Did you fine-tune GPT-2 or train from scratch? (AFAICT it's challenging to fine-tune with a different n_ctx than the base model.)
Which trainer did you use?
I'm seeing a weight tensor with shape [2048, 768] rather than the [4096, 768] I'd expect for "gpt2-4k"; is this actually a 2k-context GPT-2?
Should n_ctx/n_positions in config.json be updated?
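
For reference, here is a minimal sketch of the consistency check behind the last two questions. It assumes the tensor in question is the learned position embedding (stored as `transformer.wpe.weight` in Hugging Face GPT-2 checkpoints), whose first dimension should equal the context length. The helper name `check_position_config` and the config values below are hypothetical, for illustration only; I have not confirmed what this repo's config.json actually contains.

```python
def check_position_config(config: dict, wpe_shape: tuple) -> list:
    """Return the config keys whose value disagrees with the
    number of rows in the position-embedding matrix."""
    rows = wpe_shape[0]  # one row per position the model can attend over
    mismatches = []
    for key in ("n_positions", "n_ctx"):
        # Only compare keys that are present; newer configs may omit n_ctx.
        if key in config and config[key] != rows:
            mismatches.append(key)
    return mismatches


# Hypothetical values: a config claiming 4k context, and the
# [2048, 768] tensor shape reported above.
config = {"n_positions": 4096, "n_ctx": 4096, "n_embd": 768}
wpe_shape = (2048, 768)

mismatches = check_position_config(config, wpe_shape)
print("config keys that disagree with the weights:", mismatches)
```

If the list is non-empty, either the config or the checkpoint would need to change for the two to agree, which is what the question above is really asking.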
