Variants
#2
by someone13574 - opened
Would variants with different calibration data and pruning percentages be worth making? Also, instead of keeping the top 50% of experts, it might be better to prune relative to the most-activated expert (i.e., any expert activated less than 1/10th as often as the most-activated one gets pruned).
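To make the difference concrete, here is a rough sketch of the two selection rules. It assumes you already have per-expert activation counts for one layer from a calibration run; the function names, the example counts, and the 0.1 ratio are all hypothetical and not taken from the actual pruning script.

```python
def keep_top_fraction(counts, fraction=0.5):
    """Keep a fixed fraction of experts, ranked by activation count."""
    n_keep = max(1, int(len(counts) * fraction))
    ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    return sorted(ranked[:n_keep])


def keep_relative_to_max(counts, ratio=0.1):
    """Keep experts activated at least `ratio` times as often as the busiest one."""
    threshold = max(counts) * ratio
    return [i for i, c in enumerate(counts) if c >= threshold]


# Made-up activation counts for 8 experts in a single layer.
counts = [1000, 900, 850, 800, 120, 90, 40, 5]

print(keep_top_fraction(counts))     # [0, 1, 2, 3]
print(keep_relative_to_max(counts))  # [0, 1, 2, 3, 4]
```

With the relative rule, the number of experts kept can differ per layer depending on how skewed the routing is, so the resulting model size is no longer fixed in advance.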