This isn’t actually doing self-supervised curriculum learning.

#2
by SuperSonnix71 - opened

What the model is doing is estimating how difficult a sequence is from its own perplexity, and then using that signal to decide how many recursion steps to run. That isn't self-supervised curriculum learning.

So it’s basically adjusting the amount of compute based on difficulty. I’d call that adaptive compute, not self-supervised curriculum learning. In a true self-supervised curriculum, the training progression itself changes: for example, the model gradually moves from easier samples to harder ones over time. That isn’t happening here. 😉
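
To make the distinction concrete, here is a rough sketch of the two ideas side by side. This is not the model's actual code: `recursion_steps`, `curriculum_order`, the log-linear mapping, and the defaults (`max_steps=8`, `ppl_cap=100.0`) are all hypothetical names and values chosen for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def recursion_steps(ppl, min_steps=1, max_steps=8, ppl_cap=100.0):
    """Adaptive compute: pick a recursion depth for ONE input from its perplexity.

    Hypothetical mapping: log-linear in perplexity, clamped to
    [min_steps, max_steps]. The training data order is untouched.
    """
    frac = math.log(ppl) / math.log(ppl_cap)
    frac = min(max(frac, 0.0), 1.0)
    return min_steps + round(frac * (max_steps - min_steps))


def curriculum_order(samples, difficulty):
    """Curriculum learning: change the ORDER in which samples are trained on
    (easy first, hard later). Compute per sample stays the same."""
    return sorted(samples, key=difficulty)
```

The first function changes how much work is spent on a given input; the second changes what the model sees and when. Only the second is a curriculum.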

You're completely right, my bad on the terminology.
It's adaptive compute: using the model's own perplexity to allocate recursion depth per input, not curriculum learning in the training-progression sense. Thanks for catching that 🙏
