Looking for a formula to help with "Custom Aggregation" (i.e. "Writing" score w/custom "Reading Grade")
Hey there! This might be in the realm of "proprietary stuff DPTE wouldn't want to share," but I'm wondering if you'd be willing to share the formula or algorithm you use to calculate the writing score specifically. I like its results overall, but I'm curious to see how different models might fare if I were ranking based on a reading grade level of 8 vs. 5.5, for example.
Given it seems like the raw results are already presented, I think this probably wouldn't be useful for people hoping to game the benchmarks, but as Writing is one of the Top Categories™️, I could be mistaken there.
(I mention the writing score, but I'd be keen to see any of the other algorithms that generate these aggregate scores as well.)
I'd be happy to sign any sort of "please don't share" agreement if you wanted to release it privately, but I certainly have no expectations there - were I in your shoes, I'd only share this with a Rando Online Person if I was willing to share it with the world.
(I do think it could be fun to put together an app that lets one toy with these "preference parameters," but that's something I'd only do with your direct and explicit blessing, of course.)
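To make the idea concrete, here's a rough sketch of the kind of "preference parameter" scoring I'm imagining. To be clear, this is not the leaderboard's actual formula (which I don't know): the linear falloff, the `tolerance` parameter, the weights, and the model numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Preference:
    target: float     # value the user considers ideal (e.g. reading grade 8)
    tolerance: float  # distance from target at which the score hits 0 (assumed)
    weight: float     # relative importance of this metric (assumed)


def closeness(value: float, target: float, tolerance: float) -> float:
    """Return 1.0 at the target, falling linearly to 0.0 at +/- tolerance."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)


def custom_score(metrics: dict[str, float], prefs: dict[str, Preference]) -> float:
    """Weighted average of per-metric closeness scores, in [0, 1]."""
    total_weight = sum(p.weight for p in prefs.values())
    weighted = sum(
        p.weight * closeness(metrics[name], p.target, p.tolerance)
        for name, p in prefs.items()
    )
    return weighted / total_weight


def rank(models: dict[str, dict[str, float]], prefs: dict[str, Preference]) -> list[str]:
    """Model names sorted from best to worst under the given preferences."""
    return sorted(models, key=lambda name: custom_score(models[name], prefs), reverse=True)


# Hypothetical models and values, NOT real leaderboard data.
models = {
    "model_a": {"reading_grade": 5.4},
    "model_b": {"reading_grade": 8.2},
    "model_c": {"reading_grade": 6.9},
}

grade_8 = {"reading_grade": Preference(target=8.0, tolerance=4.0, weight=1.0)}
grade_5_5 = {"reading_grade": Preference(target=5.5, tolerance=4.0, weight=1.0)}

print(rank(models, grade_8))    # model_b first: closest to grade 8
print(rank(models, grade_5_5))  # model_a first: closest to grade 5.5
```

With more metrics in each model's dict and more entries in the preferences, the same function would give a single tunable aggregate, which is roughly what sliders in an app would be adjusting.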
https://huggingface.co/spaces/VOIDER/UGI-Leaderboard-Presets
This fork has a "🎨 Custom Weights" menu. I think it's what you're proposing, right?
Yeah, I state the optimal values the writing equation uses in parentheses on the headers, but I'd rather not fully post the exact formula weights or how the writing metrics are calculated.
I'm not keen on trying to fit weight/target sliders for every writing score parameter onto the leaderboard.