Domain adaptation
Hey Aaron,
this model is great!
If I wanted to adapt this to a special domain, like German medical texts, what would be my best path forward?
Can I domain-adapt with unlabelled data, i.e. gigabytes' worth of text? Or would I have to generate labels (query, positive, negative)?
Can you give me some pointers?
Hi,
thanks!
It depends on the task. For semantic similarity or semantic search, you could fine-tune the model further on labeled data. Note that the similarity labels should lie between 0 (no similarity) and 1 (identical meaning).
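A minimal sketch of what such a fine-tuning run could look like with sentence-transformers. The sentence pairs are made up for illustration, and the 0 to 5 rating scale, the helper names (normalize_scores, finetune), and the hyperparameters are assumptions, not something prescribed for this model. The only point taken from above is that labels must end up between 0 and 1.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str, float]


def normalize_scores(pairs: Iterable[Pair], max_score: float = 5.0) -> List[Pair]:
    """Rescale raw similarity ratings into [0, 1] and clip out-of-range values.

    max_score=5.0 assumes an STS-style 0..5 annotation scale (an assumption).
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return [(a, b, min(max(s / max_score, 0.0), 1.0)) for a, b, s in pairs]


def finetune(pairs: List[Pair],
             model_name: str = "aari1995/German_Semantic_STS_V2",
             epochs: int = 1,
             batch_size: int = 16):
    """Continue training on (sentence_a, sentence_b, score in [0, 1]) triples."""
    # Imported here so the helper above works without the heavy dependencies.
    from torch.utils.data import DataLoader
    from sentence_transformers import InputExample, SentenceTransformer, losses

    model = SentenceTransformer(model_name)
    examples = [InputExample(texts=[a, b], label=score) for a, b, score in pairs]
    loader = DataLoader(examples, shuffle=True, batch_size=batch_size)
    # CosineSimilarityLoss regresses the cosine similarity onto the label.
    loss = losses.CosineSimilarityLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=epochs, warmup_steps=10)
    return model


# Hypothetical medical sentence pairs, rated on a 0..5 scale.
raw_pairs = [
    ("Der Patient klagt über Brustschmerzen.", "Der Patient hat Schmerzen in der Brust.", 5.0),
    ("Der Patient klagt über Brustschmerzen.", "Die Patientin hat Atemnot.", 2.5),
    ("Der Patient klagt über Brustschmerzen.", "Heute ist ein sonniger Tag.", 0.0),
]
train_pairs = normalize_scores(raw_pairs)
print([score for _, _, score in train_pairs])
# model = finetune(train_pairs)  # downloads the model and trains; run when ready
```

In practice you would need far more than three pairs, plus a held-out set to check that the domain scores actually improve.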
Also, consider adding new tokens to the vocabulary, for example for domain-specific terms, before fine-tuning. If you want to stick with sentence-transformers, this issue could help:
https://github.com/UKPLab/sentence-transformers/issues/744
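As a rough sketch of the token-adding step (not necessarily the exact approach from the linked issue): the tokenizer and the underlying Hugging Face model can be reached through the first module of the SentenceTransformer, new tokens are added there, and the embedding matrix is resized to match. The helper names and the example terms are hypothetical.

```python
from typing import Iterable, List, Mapping


def select_new_tokens(candidates: Iterable[str], vocab: Mapping[str, int]) -> List[str]:
    """Keep candidates that are not yet in the vocabulary, without duplicates."""
    seen = set()
    new_tokens = []
    for token in candidates:
        if token not in vocab and token not in seen:
            seen.add(token)
            new_tokens.append(token)
    return new_tokens


def add_domain_tokens(candidates: Iterable[str],
                      model_name: str = "aari1995/German_Semantic_STS_V2"):
    """Add domain terms to the tokenizer and resize the embedding matrix."""
    # Imported here so select_new_tokens works without the heavy dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    transformer = model[0]  # first module: wraps tokenizer and auto_model
    new_tokens = select_new_tokens(candidates, transformer.tokenizer.get_vocab())
    transformer.tokenizer.add_tokens(new_tokens)
    # New embedding rows are freshly initialized and only become useful
    # after fine-tuning on domain text.
    transformer.auto_model.resize_token_embeddings(len(transformer.tokenizer))
    return model


# Hypothetical domain terms checked against a toy vocabulary.
toy_vocab = {"Hund": 0, "Tag": 1}
print(select_new_tokens(["Hund", "Myokardinfarkt", "Myokardinfarkt", "Sepsis"], toy_vocab))
# model = add_domain_tokens(["Myokardinfarkt", "Sepsis"])  # downloads the model
```

It is probably worth adding only terms that the existing tokenizer splits into many pieces, since every new token starts without a trained embedding.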
If you have enough resources, you could also train from scratch. Or just start from a smaller pretrained model:
https://huggingface.co/GerMedBERT/medbert-512
For semantic similarity, you could then either fine-tune on your own labeled dataset or fine-tune on STS to prime the model for the task. The first option would be more promising.
All the best
Aaron