Is this simply a weighted average of each parameter? Could you share the merging code, please? Thanks!
https://github.com/TehVenomm/LM_Transformers_BlockMerge
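
For comparison, here is a minimal sketch of what a plain weighted average of each parameter would look like. It is a generic illustration, not the code from the repository linked above, which may merge differently. The names `StateDict`, `weighted_average_merge`, and `alpha` are hypothetical, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's named parameters (name -> flat list of weights).
StateDict = Dict[str, List[float]]


def weighted_average_merge(a: StateDict, b: StateDict, alpha: float) -> StateDict:
    """Return alpha * a + (1 - alpha) * b, computed parameter by parameter."""
    if a.keys() != b.keys():
        raise ValueError("both models must have the same parameter names")
    merged: StateDict = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise linear interpolation between the two models' weights.
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])]
    return merged
```

With `alpha = 0.5` this is an even blend of the two models; `alpha = 1.0` returns the first model's weights unchanged.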