Smaller version of StarCoder

#53

by jiang719 - opened Jun 12, 2023

Jun 12, 2023

In my experience across several tasks, StarCoder is indeed state of the art.
But the 15.5B model is too large for some personal use cases.
Would you consider pre-training and releasing smaller versions, say 3B or 7B? I would really appreciate that.

junliu44

Jun 13, 2023

Have you tried loading the model with load_in_8bit?
https://huggingface.co/docs/transformers/main_classes/quantization
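
For readers who want to try this suggestion, here is a minimal sketch, not an official recipe. It assumes the `transformers`, `accelerate` and `bitsandbytes` packages are installed, that a CUDA GPU is available, and that the model ID is `bigcode/starcoder` (an assumption; access to that repository may require accepting its license and logging in). The helper names `weight_memory_gb` and `load_8bit` are hypothetical. The estimate covers the weights only and ignores activations and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


def load_8bit(model_id: str = "bigcode/starcoder"):
    """Load a causal LM with 8-bit weights (hypothetical helper).

    Needs transformers, accelerate, bitsandbytes and a CUDA GPU.
    The default model ID is an assumption, not taken from the thread.
    """
    # Imported lazily so the estimate above works without these packages.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # let accelerate place layers on available devices
    )
    return tokenizer, model


if __name__ == "__main__":
    # Rough weight footprint of a 15.5B-parameter model at two precisions.
    for bits in (16, 8):
        print(f"{bits}-bit weights: ~{weight_memory_gb(15.5e9, bits):.1f} GB")
```

By this arithmetic the 15.5B weights take roughly 31 GB at 16 bits and about 15.5 GB at 8 bits, so 8-bit loading halves the weight footprint but may still be too large for smaller consumer GPUs.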

Bilibili

Jun 19, 2023

There is also TinyStarCoder (https://huggingface.co/bigcode/tiny_starcoder_py), but at 164M parameters it seems too small to perform well.

loubnabnl

BigCode org Jun 19, 2023

Yes, we'll probably release smaller checkpoints in the 3B-7B range in the coming months.

noobmldude

Jun 19, 2023

It would be great to have smaller checkpoints in the 3B-7B range.

loubnabnl

BigCode org Oct 5, 2023

The 1B, 3B and 7B models were released a few months ago.

loubnabnl changed discussion status to closed Oct 5, 2023

noobmldude

Oct 5, 2023

Thanks. I found these:
1B: https://huggingface.co/bigcode/starcoderbase-1b
3B: https://huggingface.co/bigcode/starcoderbase-3b
7B: https://huggingface.co/bigcode/starcoderbase-7b

Are the Megatron weights planned to be released as well?
Or is there a way to convert these to Megatron format, so we could fine-tune using bigCode/Megatron-LM?
