Loading on M2
#1
by cooki3monster - opened
I get the following error when trying to load the model on an M2 with Metal in llama.cpp:
Asserting on type 8
GGML_ASSERT: ggml-metal.m:706: false && "not implemented"
GGML_ASSERT: ggml-metal.m:758: false && "not implemented"
Abort trap: 6
The model works fine when running llama.cpp without the -ngl flag.
Thanks for all your work.