Instructions for using iampanda/zpoint_large_embedding_zh with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use iampanda/zpoint_large_embedding_zh with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("iampanda/zpoint_large_embedding_zh")
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```
- Notebooks
- Google Colab
- Kaggle
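The `model.similarity(...)` call above returns pairwise similarity scores between the embeddings. As a minimal illustration of what that amounts to (assuming cosine similarity, which is the default in recent sentence-transformers versions), here is the same computation on toy vectors in plain Python, with no model download required:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings" standing in for model.encode(...) output.
emb1 = [1.0, 0.0, 1.0]
emb2 = [1.0, 0.0, 1.0]
emb3 = [0.0, 1.0, 0.0]

# Pairwise similarity matrix, analogous to model.similarity(embs, embs).
sims = [[cosine_similarity(u, v) for v in (emb1, emb2, emb3)]
        for u in (emb1, emb2, emb3)]
# Identical vectors score 1.0; orthogonal vectors score 0.0.
```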
What is the best chunk size?
#9
by hulianxue - opened
Consider this application scenario:

I have long-form text that of course cannot be fed into the embedding model all at once, so I have to cut the text into chunks, embed them, and push the embeddings into a vector-recall system.

To achieve the best recall performance, what is the best chunk size? Have you run any experiments on this, or do you have any suggestions based on your training data distribution?

Thanks!
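While the question is open, a common baseline for the chunking step described above is fixed-size character chunks with overlap, sized safely under the model's maximum sequence length (many BERT-style Chinese embedding models cap at 512 tokens, but check this model's config rather than assuming it). The helper below is a hypothetical sketch, not part of the model's tooling, and the `chunk_size`/`overlap` defaults are illustrative, not values recommended by the model authors:

```python
def chunk_text(text, chunk_size=256, overlap=64):
    """Split text into fixed-size character chunks with overlap.

    Overlap keeps context that straddles a chunk boundary retrievable
    from at least one chunk. Defaults are illustrative only.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# Each chunk would then be embedded and indexed, e.g.:
#   embeddings = model.encode(chunks)
# before being pushed into the vector-recall system.
chunks = chunk_text("".join(str(i % 10) for i in range(1000)))
```

Tuning `chunk_size` against your own retrieval benchmark (recall@k on representative queries) is usually more reliable than any fixed rule of thumb.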