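The current version was trained on a single GPU. The architecture uses YaRN, GQA, and (optionally) MoE. The pre-training and post-training (SFT and GRPO) data includes text, code, and images. There is currently no image-text training stage.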
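As a rough illustration of the GQA (grouped-query attention) part of the architecture, the sketch below shows how query heads are commonly mapped onto a smaller set of shared key/value heads. The function names and the head counts (8 query heads, 2 KV heads) are hypothetical and chosen only for the example; they are not taken from this model's configuration.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the KV head shared by a given query head.

    Assumes the common GQA convention where consecutive query heads
    form a group that attends with the same key/value head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise IndexError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # query heads per KV head
    return q_head // group_size


def kv_cache_fraction(n_q_heads: int, n_kv_heads: int) -> float:
    """KV-cache size relative to standard multi-head attention."""
    return n_kv_heads / n_q_heads


# Hypothetical sizes for illustration only.
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]
print(mapping)
print(kv_cache_fraction(8, 2))
```

With `n_kv_heads == n_q_heads` this reduces to ordinary multi-head attention, and with `n_kv_heads == 1` it reduces to multi-query attention.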
