output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True
)
print(output)

sentence-transformers==2.7.0
$ cat load.py
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
$ TRANSFORMERS_CACHE=$PWD/cache python load.py
./llama.cpp/convert-hf-to-gguf.py \
    cache/models--sentence-transformers--all-MiniLM-L6-v2/snapshots/e4ce9877abf3edfe10b0d82785e83bdcb973e22e/ \
    --outfile all-MiniLM-L6-v2.e4ce9877.{ftype}.gguf \
    --outtype=f32
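
The snapshot path passed to the conversion script contains a commit hash, which changes whenever the upstream model is updated. The following is a minimal sketch for locating that directory instead of typing the hash by hand. It assumes the Hugging Face hub cache layout `<cache>/models--<org>--<name>/snapshots/<commit-hash>/`, which matches the path used above; the helper name `find_snapshot_dirs` is hypothetical and not part of any library.

```python
from pathlib import Path


def find_snapshot_dirs(cache_dir, repo_id):
    """Return the snapshot directories for repo_id under a hub-style cache.

    Hypothetical helper: assumes the layout
    <cache_dir>/models--<org>--<name>/snapshots/<commit-hash>/
    """
    # "sentence-transformers/all-MiniLM-L6-v2" becomes
    # "models--sentence-transformers--all-MiniLM-L6-v2"
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return []
    # One subdirectory per downloaded commit; sorted for stable output.
    return sorted(p for p in snapshots.iterdir() if p.is_dir())


if __name__ == "__main__":
    # "cache" is the directory used as TRANSFORMERS_CACHE in the commands above.
    for path in find_snapshot_dirs("cache", "sentence-transformers/all-MiniLM-L6-v2"):
        print(path)
```

Each printed path can be passed as the first argument to `convert-hf-to-gguf.py`. If more than one snapshot is listed, pick the one whose hash matches the revision you intend to convert.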
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="asg017/sqlite-lembed-model-examples",
    filename="",
)