Loading anchors...
N=25 anchors
=== Held-out task: emotion ===
procrustes ...
topk_global_ridge k=5 ...
selected: ['CR', 'sst2', 'tweet_sentiment', 'rotten', 'tweet_stance_atheism']
topk_pertensor_ridge k=5 ...
topk_global_ridge k=8 ...
selected: ['CR', 'sst2', 'tweet_sentiment', 'rotten', 'tweet_stance_atheism', 'subj', 'insincere', 'tweet_stance_climate']
topk_pertensor_ridge k=8 ...
topk_global_ridge k=12 ...
selected: ['CR', 'sst2', 'tweet_sentiment', 'rotten', 'tweet_stance_atheism', 'subj', 'insincere', 'tweet_stance_climate', 'amazon_pol', 'sst5', 'tweet_irony', 'tweet_stance_feminist']
topk_pertensor_ridge k=12 ...
topk12_pertensor_mlp ...
[MLP] training on 768 (block,task) pairs of dim 6
ep 0 mse=1.45046
ep 50 mse=0.85978
ep 100 mse=0.72640
ep 150 mse=0.69930
ep 200 mse=0.68228
ep 250 mse=0.66346
ep 299 mse=0.64086
/usr/local/lib/python3.12/site-packages/transformers/generation/configuration_utils.py:590: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
/usr/local/lib/python3.12/site-packages/transformers/generation/configuration_utils.py:595: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
base_Y=0.3367
procrustes cos=0.9830 acc=0.3767
topk5_global_ridge cos=0.9819 acc=0.3767
topk5_pertensor_ridge cos=0.9813 acc=0.3733
topk8_global_ridge cos=0.9830 acc=0.3633
topk8_pertensor_ridge cos=0.9824 acc=0.3667
topk12_global_ridge cos=0.9838 acc=0.3733
topk12_pertensor_ridge cos=0.9831 acc=0.3667
topk12_pertensor_mlp cos=0.9827 acc=0.3600
oracle_Y=0.5467
=== Held-out task: tweet_emotion ===
procrustes ...
topk_global_ridge k=5 ...
selected: ['tweet_sentiment', 'tweet_stance_atheism', 'tweet_stance_feminist', 'tweet_stance_abortion', 'hate_speech_off']
topk_pertensor_ridge k=5 ...
topk_global_ridge k=8 ...
selected: ['tweet_sentiment', 'tweet_stance_atheism', 'tweet_stance_feminist', 'tweet_stance_abortion', 'hate_speech_off', 'CR', 'tweet_irony', 'tweet_offensive']
topk_pertensor_ridge k=8 ...
topk_global_ridge k=12 ...
selected: ['tweet_sentiment', 'tweet_stance_atheism', 'tweet_stance_feminist', 'tweet_stance_abortion', 'hate_speech_off', 'CR', 'tweet_irony', 'tweet_offensive', 'sst2', 'tweet_stance_hillary', 'rotten', 'insincere']
topk_pertensor_ridge k=12 ...
topk12_pertensor_mlp ...
[MLP] training on 768 (block,task) pairs of dim 6
ep 0 mse=1.51879
ep 50 mse=0.83916
ep 100 mse=0.75799
ep 150 mse=0.74077
ep 200 mse=0.72469
ep 250 mse=0.70488
ep 299 mse=0.68055
base_Y=0.4667
procrustes cos=0.9857 acc=0.2500
topk5_global_ridge cos=0.9855 acc=0.2433
topk5_pertensor_ridge cos=0.9851 acc=0.2400
topk8_global_ridge cos=0.9865 acc=0.2467
topk8_pertensor_ridge cos=0.9860 acc=0.2600
topk12_global_ridge cos=0.9866 acc=0.2600
topk12_pertensor_ridge cos=0.9859 acc=0.2600
topk12_pertensor_mlp cos=0.9853 acc=0.2500
oracle_Y=0.7267
=== Held-out task: bbc_news ===
procrustes ...
topk_global_ridge k=5 ...
selected: ['ag_news', 'imdb', 'enron_spam', 'tweet_stance_atheism', 'tweet_stance_climate']
topk_pertensor_ridge k=5 ...
topk_global_ridge k=8 ...
selected: ['ag_news', 'imdb', 'enron_spam', 'tweet_stance_atheism', 'tweet_stance_climate', 'toxic_conv', 'tweet_sentiment', 'subj']
topk_pertensor_ridge k=8 ...
topk_global_ridge k=12 ...
selected: ['ag_news', 'imdb', 'enron_spam', 'tweet_stance_atheism', 'tweet_stance_climate', 'toxic_conv', 'tweet_sentiment', 'subj', 'tweet_stance_feminist', '20news', 'tweet_stance_abortion', 'rotten']
topk_pertensor_ridge k=12 ...
topk12_pertensor_mlp ...
[MLP] training on 768 (block,task) pairs of dim 6
ep 0 mse=1.68046
ep 50 mse=0.97068
ep 100 mse=0.80479
ep 150 mse=0.77734
ep 200 mse=0.75944
ep 250 mse=0.74023
ep 299 mse=0.71829
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
base_Y=0.0633
procrustes cos=0.9754 acc=0.0167
topk5_global_ridge cos=0.9798 acc=0.0133
topk5_pertensor_ridge cos=0.9793 acc=0.0067
topk8_global_ridge cos=0.9799 acc=0.0100
topk8_pertensor_ridge cos=0.9792 acc=0.0067
topk12_global_ridge cos=0.9801 acc=0.0200
topk12_pertensor_ridge cos=0.9794 acc=0.0067
topk12_pertensor_mlp cos=0.9763 acc=0.0200
oracle_Y=0.1033
=== Held-out task: ethos_binary ===
procrustes ...
topk_global_ridge k=5 ...
selected: ['tweet_stance_atheism', 'tweet_stance_climate', 'tweet_stance_abortion', 'tweet_stance_feminist', 'tweet_sentiment']
topk_pertensor_ridge k=5 ...
topk_global_ridge k=8 ...
selected: ['tweet_stance_atheism', 'tweet_stance_climate', 'tweet_stance_abortion', 'tweet_stance_feminist', 'tweet_sentiment', 'toxic_conv', 'sst2', 'CR']
topk_pertensor_ridge k=8 ...
topk_global_ridge k=12 ...
selected: ['tweet_stance_atheism', 'tweet_stance_climate', 'tweet_stance_abortion', 'tweet_stance_feminist', 'tweet_sentiment', 'toxic_conv', 'sst2', 'CR', 'tweet_irony', 'rotten', 'tweet_stance_hillary', 'amazon_cf']
topk_pertensor_ridge k=12 ...
topk12_pertensor_mlp ...
[MLP] training on 768 (block,task) pairs of dim 6
ep 0 mse=1.33811
ep 50 mse=0.76094
ep 100 mse=0.65318
ep 150 mse=0.62853
ep 200 mse=0.61336
ep 250 mse=0.59739
ep 299 mse=0.57926
Repo card metadata block was not found. Setting CardData to empty.
base_Y=0.5033
procrustes cos=0.9902 acc=0.6600
topk5_global_ridge cos=0.9873 acc=0.7567
topk5_pertensor_ridge cos=0.9870 acc=0.7600
topk8_global_ridge cos=0.9897 acc=0.7633
topk8_pertensor_ridge cos=0.9892 acc=0.7500
topk12_global_ridge cos=0.9902 acc=0.7800
topk12_pertensor_ridge cos=0.9896 acc=0.7833
topk12_pertensor_mlp cos=0.9873 acc=0.7000
oracle_Y=0.7033
=== Held-out task: trec ===
procrustes ...
topk_global_ridge k=5 ...
selected: ['tweet_stance_atheism', 'tweet_stance_climate', 'tweet_stance_abortion', 'tweet_stance_feminist', 'tweet_stance_hillary']
topk_pertensor_ridge k=5 ...
topk_global_ridge k=8 ...
selected: ['tweet_stance_atheism', 'tweet_stance_climate', 'tweet_stance_abortion', 'tweet_stance_feminist', 'tweet_stance_hillary', 'subj', 'sst5', 'sst2']
topk_pertensor_ridge k=8 ...
topk_global_ridge k=12 ...
selected: ['tweet_stance_atheism', 'tweet_stance_climate', 'tweet_stance_abortion', 'tweet_stance_feminist', 'tweet_stance_hillary', 'subj', 'sst5', 'sst2', 'rotten', 'CR', 'tweet_sentiment', 'ag_news']
topk_pertensor_ridge k=12 ...
topk12_pertensor_mlp ...
[MLP] training on 768 (block,task) pairs of dim 6
ep 0 mse=1.44209
ep 50 mse=0.83229
ep 100 mse=0.68558
ep 150 mse=0.66112
ep 200 mse=0.64794
ep 250 mse=0.63584
ep 299 mse=0.62254
Repo card metadata block was not found. Setting CardData to empty.
base_Y=0.1933
procrustes cos=0.9715 acc=0.2000
topk5_global_ridge cos=0.9643 acc=0.1933
topk5_pertensor_ridge cos=0.9639 acc=0.1933
topk8_global_ridge cos=0.9710 acc=0.2467
topk8_pertensor_ridge cos=0.9702 acc=0.3600
topk12_global_ridge cos=0.9721 acc=0.2033
topk12_pertensor_ridge cos=0.9710 acc=0.2233
topk12_pertensor_mlp cos=0.9705 acc=0.2267
oracle_Y=0.4533
=== AVG (new methods) ===
base_Y                  0.3127
procrustes              0.3007
topk5_global_ridge      0.3167
topk5_pertensor_ridge   0.3147
topk8_global_ridge      0.3260
topk8_pertensor_ridge   0.3487
topk12_global_ridge     0.3273
topk12_pertensor_ridge  0.3280
topk12_pertensor_mlp    0.3113
oracle_Y                0.5067
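
The AVG figures appear to be plain unweighted means over the five held-out tasks. The sketch below recomputes them from the per-task `acc` values (and `base_Y` / `oracle_Y`) printed in this log. The dictionary layout and variable names are illustrative assumptions, not part of the script that produced the log, and because the inputs are already rounded to four decimals the results may differ from the logged averages in the last digit.

```python
from statistics import mean

# Held-out tasks, in the order they appear in the log.
TASKS = ["emotion", "tweet_emotion", "bbc_news", "ethos_binary", "trec"]

# Per-task accuracies copied from the log (hypothetical layout, same task order as TASKS).
ACC = {
    "base_Y":                 [0.3367, 0.4667, 0.0633, 0.5033, 0.1933],
    "procrustes":             [0.3767, 0.2500, 0.0167, 0.6600, 0.2000],
    "topk5_global_ridge":     [0.3767, 0.2433, 0.0133, 0.7567, 0.1933],
    "topk5_pertensor_ridge":  [0.3733, 0.2400, 0.0067, 0.7600, 0.1933],
    "topk8_global_ridge":     [0.3633, 0.2467, 0.0100, 0.7633, 0.2467],
    "topk8_pertensor_ridge":  [0.3667, 0.2600, 0.0067, 0.7500, 0.3600],
    "topk12_global_ridge":    [0.3733, 0.2600, 0.0200, 0.7800, 0.2033],
    "topk12_pertensor_ridge": [0.3667, 0.2600, 0.0067, 0.7833, 0.2233],
    "topk12_pertensor_mlp":   [0.3600, 0.2500, 0.0200, 0.7000, 0.2267],
    "oracle_Y":               [0.5467, 0.7267, 0.1033, 0.7033, 0.4533],
}

# Unweighted mean accuracy per method across the held-out tasks.
avg = {name: mean(vals) for name, vals in ACC.items()}

# Best method by mean accuracy, excluding the two reference rows.
candidates = [name for name in ACC if name not in ("base_Y", "oracle_Y")]
best = max(candidates, key=avg.get)

for name, value in avg.items():
    delta = value - avg["base_Y"]
    print(f"{name:24s} {value:.4f}  (vs base_Y: {delta:+.4f})")
print("best non-oracle method by mean accuracy:", best)
```

Reading the averages as a difference against `base_Y` makes it easier to see which methods improve on the base model on average and which fall below it; a single task with a large swing can move a five-task mean noticeably.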