Recent Activity

Xetro  updated a collection about 3 hours ago
NSFW - Text to Image

Xetro 
in UnfilteredAI/README about 3 hours ago

Is UnfilteredAI dead now?

1
#3 opened about 14 hours ago by
Abhaykoul
KingNish 
posted an update about 2 months ago
Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.

Short story: Pure Muon converged fastest at the start, but its gradient‑norm spikes made training unstable. MuonClip (Kimi K2’s clipping) stabilizes long pretraining runs, yet in our small‑scale fine‑tune it underperformed, with lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance, and it even beat vanilla AdamW.

Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.
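
For readers who want to picture the hybrid split, here is a minimal sketch of how parameters might be routed by dimensionality. It is not the code used in these runs: the function name `split_for_hybrid`, the `adamw_name_hints` option, and the example parameter names are all hypothetical. Sending anything that is not exactly 2D to AdamW is also an assumption.

```python
from typing import Dict, Iterable, List, Tuple


def split_for_hybrid(
    params: Iterable[Tuple[str, int]],
    adamw_name_hints: Tuple[str, ...] = (),
) -> Dict[str, List[str]]:
    """Route parameter names to "muon" or "adamw" by tensor rank.

    params: (name, ndim) pairs, one per trainable parameter.
    adamw_name_hints: hypothetical escape hatch; any name containing one of
        these substrings goes to AdamW even if the tensor is 2D.
    """
    groups: Dict[str, List[str]] = {"muon": [], "adamw": []}
    for name, ndim in params:
        forced_to_adamw = any(hint in name for hint in adamw_name_hints)
        if ndim == 2 and not forced_to_adamw:
            # 2D weight matrices (e.g. linear projections) go to Muon.
            groups["muon"].append(name)
        else:
            # 1D tensors (biases, norm gains) go to AdamW. Assumption: any
            # other rank is also sent to AdamW in this sketch.
            groups["adamw"].append(name)
    return groups


# Hypothetical parameter names, for illustration only.
example_params = [
    ("layers.0.attn.q_proj.weight", 2),
    ("layers.0.attn.q_norm.weight", 1),
    ("layers.0.mlp.up_proj.weight", 2),
    ("layers.0.mlp.up_proj.bias", 1),
    ("embed_tokens.weight", 2),
]

groups = split_for_hybrid(example_params)
groups_with_hint = split_for_hybrid(example_params, adamw_name_hints=("embed",))
```

With a PyTorch model, the pairs could be produced with `((n, p.ndim) for n, p in model.named_parameters())`, and each group would then be handed to its own optimizer instance. Whether 2D embedding or output-head weights belong with Muon or AdamW is a separate choice that the post does not specify, which is why the name hints are left empty by default.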

Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.

Full Blog Link: https://huggingface.co/blog/KingNish/optimizer-part1
Abhaykoul 
in UnfilteredAI/DAN 3 months ago

Update README.md

#2 opened 3 months ago by
Deathstormer
Abhaykoul 
in HelpingAI/HELVETE-3B 5 months ago

architecture

2
#7 opened 10 months ago by
BornSaint