Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training Paper • 2511.01918 • Published Nov 1, 2025
I've just distilled Llama-3.2-3B-Instruct with deepseek-ai/DeepSeek-R1 on the ServiceNow-AI/R1-Distill-SFT dataset. Here is the model: suayptalha/DeepSeek-R1-Distill-Llama-3B
My latest Falcon3-7B merge model, suayptalha/Falcon3-Jessi-v0.4-7B-Slerp, is currently ranked #1 on the open-llm-leaderboard/open_llm_leaderboard among all models with up to 14B parameters. My Qwen2.5-7B merge model, suayptalha/HomerCreativeAnvita-Mix-Qw7B, is ranked #7, which places two of my models in the top 10!