Transformers-compatible checkpoint
Hi! This is such a fun project!
I just wanted to request a transformers-compatible checkpoint to make it easier for people to use and convert the model to other formats (I want to create an ONNX export, for example).
There is a conversion script which should be handy: https://github.com/huggingface/transformers/blob/02063e683595e4a3e7f4e5be2fee17cab129e4bb/src/transformers/models/nanochat/convert_nanochat_checkpoints.py
and @burtenshaw has a nice article at https://huggingface.co/spaces/nanochat-students/transformers (maybe he can help out too!)
Thanks! I'll get to it and let you know when it's done!
Great to hear! Thanks!
After looking into this a little, it seems the convert_nanochat_checkpoints script hasn't been updated to accommodate a few of the latest nanochat architectural features used to train the original model. Conversion might be a little more involved than first anticipated. I'll let you know!
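For anyone curious what this kind of mismatch looks like in practice, here is a minimal sketch of the check, not the actual conversion script. It assumes you already have the checkpoint's parameter names as plain strings and a list of the names a converter knows how to map. The function name and every parameter name below are hypothetical, made up purely for illustration.

```python
def diff_parameter_names(checkpoint_names, supported_names):
    """Compare two collections of parameter names.

    Returns a tuple (unhandled, missing) of sorted lists:
      unhandled: names present in the checkpoint that the converter does not know
      missing:   names the converter expects that the checkpoint does not contain
    """
    checkpoint = set(checkpoint_names)
    supported = set(supported_names)
    unhandled = sorted(checkpoint - supported)
    missing = sorted(supported - checkpoint)
    return unhandled, missing


# Hypothetical names, for illustration only (not real nanochat parameter names).
checkpoint_names = ["wte.weight", "h.0.attn.c_q.weight", "h.0.attn.new_gate.weight"]
supported_names = ["wte.weight", "h.0.attn.c_q.weight", "lm_head.weight"]

unhandled, missing = diff_parameter_names(checkpoint_names, supported_names)
print("not handled by converter:", unhandled)
print("expected but absent:", missing)
```

Any name that ends up in the first list is a tensor the converter would silently drop or fail on, which is why a newer architecture can make conversion more involved.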
After further investigation, I've decided to train a new base model and a new instruct-tuned model using transformers and the Llama architecture, and upload those as fresh checkpoints. They will be slightly larger, trained on much more data, and better able to handle the representative sample of user requests I've encountered while this has been live. Closing this thread for now, and I'll let you know when the new models are uploaded!