The development of SnowflakeCore-G1-7B-MoE is getting delayed. In the meantime I am working on SnowflakeCore-G1-1B-MoE, which will be a pre-trained chatbot.
Hello! Important announcement: I will rename SnowflakeCore-G1-Medium to SnowflakeCore-G1-Tiny2, because it will have the same parameters as the Tiny version, but this one is trained on more data.
SnowflakeCore-G1 Update: Got it running and training! Context window is currently set to 2048 tokens. Training is active and stable. Will share results once I have some metrics to report.
SnowflakeCore-G1 development update: We're building a 24-layer transformer with 32K context and 1024 embedding dimensions - pretty ambitious! Even running at batch_size=1 with heavy gradient accumulation, we're hitting memory walls at 300GB RAM. Scaling up to ~1TB will take some time, but the architecture is looking promising. Thanks for following along with the journey!
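For a rough sense of scale, here's a back-of-the-envelope parameter estimate for the architecture described above. The 24 layers and 1024 embedding dimension come from the post; the vocabulary size (50K) and 4x FFN expansion are assumptions, not confirmed SnowflakeCore details:

```python
# Rough transformer parameter estimate. Layer count and d_model are from
# the post; vocab size and FFN expansion factor are assumed values.
def transformer_param_estimate(layers=24, d_model=1024, vocab=50_000, ffn_mult=4):
    attn = 4 * d_model * d_model            # Q, K, V, and output projections
    ffn = 2 * ffn_mult * d_model * d_model  # FFN up- and down-projections
    per_layer = attn + ffn
    embeddings = vocab * d_model            # token embedding table
    return layers * per_layer + embeddings

print(f"~{transformer_param_estimate() / 1e6:.0f}M parameters")  # ~353M parameters
```

Under these assumptions the weights themselves are only a few hundred million parameters; at a 32K context it's usually the activation memory that dominates, which would be consistent with hitting RAM limits even at batch_size=1.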
Hello there! I just found out that all of the SnowflakeCore-G0 series are masked language models instead of LLMs. The development of SnowflakeCore-G0-Releas-3 will be delayed even further.
Edit: I am officially ending the development of SnowflakeCore-G0 and starting the development of SnowflakeCore-G1, which SHOULD be a text generator.
Edit-2: After some evaluation of the code, the models are actually text generators, so the development of G0 will continue.
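Since the G0 confusion above came down to whether the models were masked language models or text generators, here's a minimal sketch of the distinction (illustrative only, not SnowflakeCore code): a causal text generator lets each position attend only to earlier positions, while an MLM attends bidirectionally.

```python
# Causal LM: position i may attend to positions 0..i only (lower-triangular mask).
def causal_mask(n):
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

# Masked LM: every position may attend to every position (bidirectional).
def bidirectional_mask(n):
    return [[1] * n for _ in range(n)]

print(causal_mask(3))         # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(bidirectional_mask(3))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Checking which mask a model actually uses (or whether it has a causal LM head) is one quick way to tell the two apart in code.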