The reaction to the QAT post was beyond my expectations, so here is the optimizer post I promised. But I found that I had a lot of explaining to do about optimizers themselves, so this post is actually a historical recap. The post on the Muon optimizer (used by Kimi), coming very soon, can only follow after this one.