etemiz posted an update 1 day ago
I realized that when I ask for longer answers to my questions, the models sometimes produce the completely opposite answer. What could be the reason?

I do mostly CPT. Should I convert my dataset to SFT and include longer reasonings as well, so that it has integrity?

Example: Is the yolk of an egg more beneficial or the white? Answer in 100 words.

Answer: Yolk is more beneficial because ..........

Example: Is the yolk of an egg more beneficial or the white? Answer in 500 words.

Answer: White is more beneficial because ..........

Edit: These happen at temp = 0.0.

Since the first token (Yolk|White) sets the whole story (in your example), the obvious suspect is too high a temperature; if you want a more deterministic outcome, set it lower. And since that (Yolk|White) choice is predicted near the start, it doesn't matter how long the output runs: the following tokens already attend to it and generate "reasoning" that reinforces it, as the model was instruction-trained to do.
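A minimal sketch of that mechanism (the model name is a placeholder for whatever checkpoint you're testing): with greedy decoding, which is what temp = 0.0 amounts to, the output is deterministic for a fixed prompt, but the "100 words" vs "500 words" request changes the prompt itself, so the first committed token can differ:

```python
# Sketch: greedy decoding is deterministic per prompt, but a different
# length request is a different prompt, so the first token can flip.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "your-cpt-model"  # placeholder for the model under discussion
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def greedy_answer(prompt, max_new_tokens=50):
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            do_sample=False,  # greedy decoding, the temp = 0.0 case
            max_new_tokens=max_new_tokens,
        )
    # return only the newly generated tokens
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

short = "Is the yolk of an egg more beneficial or the white? Answer in 100 words.\nAnswer:"
long_ = "Is the yolk of an egg more beneficial or the white? Answer in 500 words.\nAnswer:"
print(greedy_answer(short))  # deterministic for this prompt...
print(greedy_answer(long_))  # ...but may commit to the opposite first token
```

Once that first token is committed, everything after it attends to it, which is the "reasoning that reinforces it" described above.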


Thanks for the input, but these all happened at temp = 0.0.

My guess is that, since I mostly use datasets generated from voice, the model says one thing when it's talking like a human in day-to-day life, but the complete opposite when it's acting like a scientist producing a long text.

I think SFT would help a lot as you suspected.

The way I see it, the model is actually succeeding at what CPT is good at (pattern matching). Meaning, somewhere in the dataset there is data that favors the white over the yolk, and somewhere else the yolk is preferred over the white. It doesn't even have to be that obviously defined; it could be indirect.

So what I think happens is this:

Short question (100 words) ===> matches patterns from Q&A sites and FAQ sections (just as an example) ===> that data says the yolk wins

Long question (500 words) ===> matches patterns from blog posts and academic articles (also just examples) ===> that data says the white wins
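
If you want to test this hypothesis directly, a quick probe (same placeholder model as above) is to compare the next-token probability the model assigns to each verdict right after "Answer:" for the two prompts:

```python
# Sketch: check whether the length request shifts probability mass
# between "Yolk" and "White" at the very first answer token.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "your-cpt-model"  # placeholder
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def first_token_probs(prompt, candidates=("Yolk", "White")):
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    # NOTE: depending on the tokenizer, the candidate may need a leading
    # space; we only look at each candidate's first subword token.
    return {c: probs[tok.encode(c, add_special_tokens=False)[0]].item() for c in candidates}

for n in (100, 500):
    p = f"Is the yolk of an egg more beneficial or the white? Answer in {n} words.\nAnswer: "
    print(n, first_token_probs(p))
```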

So besides cleaning up the data, which is really out of scope here since you'd be babysitting your data for every possible length/answer combination, I think SFT will help.

With SFT the model doesn't just learn the patterns but also what humans prefer, which includes consistency across lengths. It's basically statistical correlation with CPT vs. behavioral alignment with SFT.
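
If you go the SFT route, one way to bake in that consistency is to emit the same verdict at several requested lengths. A minimal sketch (the instruction/output field names and the example verdict are assumptions; match whatever format your trainer expects):

```python
# Sketch: build SFT pairs that repeat one verdict across lengths,
# so the requested length cannot flip the answer.
import json

def make_sft_examples(question, verdict, reasoning, lengths=(100, 300, 500)):
    examples = []
    for n in lengths:
        examples.append({
            "instruction": f"{question} Answer in {n} words.",
            # the same verdict leads every answer, regardless of length
            "output": f"{verdict} is more beneficial because {reasoning}",
        })
    return examples

rows = make_sft_examples(
    question="Is the yolk of an egg more beneficial or the white?",
    verdict="The yolk",  # hypothetical verdict, purely for illustration
    reasoning="it carries most of the egg's vitamins and fats ...",
)
with open("sft_data.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```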

There's also a phenomenon called attention drift that you may want to look into; it can be helpful here.