Instructions to use TheBloke/NewHope-GGML with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use TheBloke/NewHope-GGML with Transformers:
```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("TheBloke/NewHope-GGML", dtype="auto")
```
For the GGML files themselves, a llama.cpp-based loader is typically used instead; a sketch follows this list.
- Notebooks
- Google Colab
- Kaggle
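As a minimal local-app sketch (not from the model card): assuming one of the quantised `.bin` files has been downloaded and a GGML-era release of llama-cpp-python is installed, the file can be loaded directly. The file name below is a placeholder.

```python
# Sketch: loading a GGML quant locally with llama-cpp-python (llama.cpp bindings).
# GGML v3 files need a llama-cpp-python release from before the GGUF format switch.
from llama_cpp import Llama

llm = Llama(
    model_path="./newhope.ggmlv3.q4_K_M.bin",  # placeholder name for a file from TheBloke/NewHope-GGML
    n_ctx=2048,                                # context window
)

output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```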
Should I expect lower accuracy than with the original model?
#2
by YairFr - opened
But sending the same prompt, once to the original model using LlamaForCausalLM.from_pretrained,
and once via the GGML model wrapped with LlamaCpp and used in LangChain's AgentExecutor, I get different (and worse) results.
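For reference, a minimal sketch of the two setups being compared might look like the following; the repo id, GGML file name, and generation parameters are placeholders rather than what was actually used, and the AgentExecutor wiring is left out for brevity.

```python
# Sketch of the two setups: fp16 original via transformers vs. GGML quant via LangChain's LlamaCpp.
from transformers import LlamaForCausalLM, LlamaTokenizer
from langchain.llms import LlamaCpp

prompt = "Write a Python function that reverses a string."

# 1) Original fp16 model via transformers
tokenizer = LlamaTokenizer.from_pretrained("original-org/NewHope")               # placeholder repo id
model = LlamaForCausalLM.from_pretrained("original-org/NewHope", device_map="auto")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)              # greedy, repeatable baseline
print(tokenizer.decode(out[0], skip_special_tokens=True))

# 2) GGML quant via llama.cpp, wrapped for LangChain
llm = LlamaCpp(
    model_path="./newhope.ggmlv3.q4_K_M.bin",  # placeholder local file
    n_ctx=4096,
    temperature=0.0,                           # match the greedy baseline as closely as possible
    max_tokens=256,
)
print(llm(prompt))
```

Even with the prompt held fixed, some divergence is expected: the GGML quants are lossy, and the two stacks have different default sampling settings, so pinning the generation parameters on both sides makes the comparison more direct.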
Can you provide an example of those differences?
What I wrote here https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/discussions/9 is also relevant for this model.