special tokens in prompt with ggml/examples/starcoder
#3
by mljxy - opened
Using the starcoder example in ggml, the special tokens in the prompt do not get tokenized correctly. For example,
main: token[0] = 46, <
main: token[1] = 110, |
main: token[2] = 2946, system
main: token[3] = 28318, |>
The correct tokenization should map <|system|> to the single token 49152 instead. The same incorrect tokenization happens for <|user|>, <|assistant|>, and <|end|>.
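For illustration, here is a minimal Python sketch of the behavior described above: special tokens are matched as whole strings first, and only the text between them goes through the ordinary tokenizer. This is not the ggml implementation. The names `split_on_special_tokens`, `encode`, and `fake_bpe` are hypothetical, `fake_bpe` is a stand-in for the real BPE encoder, and the only ID used is the 49152 for <|system|> quoted in this post.

```python
import re


def split_on_special_tokens(prompt, special_tokens):
    """Split a prompt into (piece, is_special) pairs, keeping special tokens whole."""
    if not special_tokens:
        return [(prompt, False)] if prompt else []
    # Try longer markers first so overlapping markers match greedily.
    alternation = "|".join(
        re.escape(tok) for tok in sorted(special_tokens, key=len, reverse=True)
    )
    pieces = []
    # The capture group makes re.split keep the matched special tokens.
    for piece in re.split(f"({alternation})", prompt):
        if piece:
            pieces.append((piece, piece in special_tokens))
    return pieces


def encode(prompt, special_tokens, encode_ordinary):
    """Map special tokens directly to their IDs; encode everything else normally."""
    ids = []
    for piece, is_special in split_on_special_tokens(prompt, special_tokens):
        if is_special:
            ids.append(special_tokens[piece])
        else:
            ids.extend(encode_ordinary(piece))
    return ids


def fake_bpe(text):
    """Hypothetical stand-in for the real BPE encoder: one ID per character."""
    return [ord(c) for c in text]


# Only the <|system|> ID is taken from this thread; other markers are omitted.
SPECIAL = {"<|system|>": 49152}
ids = encode("<|system|>\nhi", SPECIAL, fake_bpe)
print(ids)
```

With this approach <|system|> comes out as one ID rather than being broken into `<`, `|`, `system`, `|>` as in the log above.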
This was fixed last week: https://github.com/ggerganov/ggml/commit/e456108433017d5586b35fd36ce781b4c3aed631
But only kinda-sorta fixed, I think. There's still something up here: I can't get SantaCoder to spit out token 49152 (<|end|>), and the GGML inference diverges from what the HF model does.