The development of SnowflakeCore-G1-7B-MoE is getting delayed. In the meantime I am working on SnowflakeCore-G1-1B-MoE, which will be a pre-trained chatbot.
Hello! Important announcement: I will rename SnowflakeCore-G1-Medium to SnowflakeCore-G1-Tiny2, because it will have the same parameters as the Tiny version, but this one is trained on more data.
SnowflakeCore-G1 Update: Got it running and training! Context window is currently set to 2048 tokens. Training is active and stable. Will share results once I have some metrics to report.
SnowflakeCore-G1 development update: We're building a 24-layer transformer with 32K context and 1024 embedding dimensions - pretty ambitious! Even running at batch_size=1 with heavy gradient accumulation, we're hitting memory walls at 300GB RAM. Scaling up to ~1TB will take some time, but the architecture is looking promising. Thanks for following along with the journey!
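For a rough sense of scale, here's a back-of-the-envelope parameter estimate for the architecture described above. The 24 layers and 1024 embedding dimension come from the post; the vocabulary size (50K) and 4x FFN expansion are assumptions, not confirmed SnowflakeCore details:

```python
# Rough transformer parameter estimate. Layer count and d_model are from
# the post; vocab size and FFN expansion factor are assumed values.
def transformer_param_estimate(layers=24, d_model=1024, vocab=50_000, ffn_mult=4):
    attn = 4 * d_model * d_model            # Q, K, V, and output projections
    ffn = 2 * ffn_mult * d_model * d_model  # FFN up- and down-projections
    per_layer = attn + ffn
    embeddings = vocab * d_model            # token embedding table
    return layers * per_layer + embeddings

print(f"~{transformer_param_estimate() / 1e6:.0f}M parameters")  # ~353M parameters
```

Under these assumptions the weights themselves are only a few hundred million parameters; at a 32K context it's usually the activation memory that dominates, which would be consistent with hitting RAM limits even at batch_size=1.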
Hello there! I just found out that all of the SnowflakeCore-G0 series are masked language models instead of LLMs. The development of SnowflakeCore-G0-Releas-3 will be delayed even further.
Edit: I am officially ending the development of SnowflakeCore-G0 and starting the development of SnowflakeCore-G1, which SHOULD be a text generator.
Edit-2: After some evaluation of the code, the models are actually text generators, so the development of G0 will continue.
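Since the G0 confusion above came down to whether the models were masked language models or text generators, here's a minimal sketch of the distinction (illustrative only, not SnowflakeCore code): a causal text generator lets each position attend only to earlier positions, while an MLM attends bidirectionally.

```python
# Causal LM: position i may attend to positions 0..i only (lower-triangular mask).
def causal_mask(n):
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

# Masked LM: every position may attend to every position (bidirectional).
def bidirectional_mask(n):
    return [[1] * n for _ in range(n)]

print(causal_mask(3))         # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(bidirectional_mask(3))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Checking which mask a model actually uses (or whether it has a causal LM head) is one quick way to tell the two apart in code.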