Curriculum learning by length

#38
by jbakerx - opened

Start training on shorter sequences (512 or 1024), then move up to 2048.
This often stabilizes training and improves coherence, especially on CPU-only setups.
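
A rough sketch of what I mean, in case it helps. It assumes lengths are counted in tokens and that the data is already a flat list of token ids. The step boundaries (1000, 3000) and the names `SCHEDULE`, `seq_len_for_step`, and `chunk_tokens` are made up for illustration and are not from this repo.

```python
from bisect import bisect_right

# Hypothetical schedule of (first_step, sequence_length) pairs.
# The lengths come from the proposal above; the step boundaries are
# placeholders that would need tuning.
SCHEDULE = [(0, 512), (1000, 1024), (3000, 2048)]


def seq_len_for_step(step, schedule=SCHEDULE):
    """Return the sequence length in effect at a given training step."""
    starts = [start for start, _ in schedule]
    # Index of the last stage whose first_step is <= step.
    idx = bisect_right(starts, step) - 1
    return schedule[max(idx, 0)][1]


def chunk_tokens(token_ids, seq_len):
    """Split a flat token list into full chunks of seq_len.

    Any trailing tokens that do not fill a whole chunk are dropped.
    """
    n_full = len(token_ids) // seq_len
    return [token_ids[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]
```

The training loop would call `seq_len_for_step(step)` whenever it builds a batch and re-chunk the data with `chunk_tokens` when the length changes. Keeping the schedule as plain data makes it easy to try different boundaries.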

jbakerx changed discussion title from Better segmentation: train on scenes, not entire books to Curriculum learning by length

We will consider this enhancement for inclusion in version 2.0.0.
