Preference data bootstrapping (cheap but effective)
#36, opened by jbakerx
Generate a few candidates per prompt with different decoding settings, then auto-score them with:
repetition penalty metrics
“modern slang” detectors
language ID + Cyrillic cleanliness
Keep the top-scoring candidates and have humans rank a smaller subset of them.
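To make the auto-scoring step concrete, here is a rough sketch of what the filters could look like. Everything in it is an assumption for illustration: the function names, the tiny `MODERN_SLANG` word list, the equal weighting of the three signals, and the use of a Cyrillic-letter ratio as a crude stand-in for real language ID. A proper setup would likely use a trained language-ID model and a curated slang lexicon.

```python
import re
from collections import Counter

# Hypothetical slang list, for illustration only; a real detector would
# need a curated lexicon or a classifier.
MODERN_SLANG = {"лол", "кринж", "хайп", "окей"}


def repetition_score(text, n=3):
    """Fraction of word n-grams that repeat an earlier one (0.0 = none)."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def cyrillic_cleanliness(text):
    """Share of alphabetic characters in the basic Cyrillic block.

    A crude proxy for language ID: it checks the script, not the language.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04FF")
    return cyrillic / len(letters)


def slang_rate(text):
    """Share of words that appear in the (hypothetical) slang list."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    return sum(w in MODERN_SLANG for w in words) / len(words)


def auto_score(text):
    """Higher is better. Equal weights are an arbitrary starting point."""
    return cyrillic_cleanliness(text) - repetition_score(text) - slang_rate(text)


def top_candidates(candidates, k=2):
    """Return the k best-scoring candidates, best first."""
    return sorted(candidates, key=auto_score, reverse=True)[:k]
```

Candidates for one prompt would be passed to `top_candidates`, and only the survivors (or a sample of them) would go to human rankers. The weights and thresholds would need tuning against a small hand-labeled set before trusting the ordering.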
We will consider this enhancement for inclusion in version 2.0.0.