DLM variant?

#21
by danieledll - opened

I wonder how well it would perform if it were converted to a DLM. Here is an interesting solution that was applied to Qwen with amazing results, both in terms of performance and in terms of retention:
https://github.com/tencent/WeDLM
