PatriSBERT-STS

A Sentence-BERT model fine-tuned for semantic textual similarity on patristic and biblical Latin texts. It is designed to detect and measure text reuse between early Christian writings and the Vulgate Bible.

/!\ Work in progress: this is a preliminary release of PatriSBERT, made available for experimentation. Its current performance figures are provisional.

Usage

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("TdelaSelle/PatriSBERT-STS")

sentences = [
    "In principio erat Verbum",
    "Et Verbum caro factum est",
]
embeddings = model.encode(sentences)
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Cosine similarity: {similarity.item():.4f}")

Training

  • Base model: PatriSBERT-NLI, an SBERT model trained on an NLI-style dataset of Latin biblical reuses, itself built on PatriBERT (a BERT model pre-trained on Latin patristic texts)
  • Task: Semantic textual similarity (STS) via triplet fine-tuning
  • Dataset: Latin biblical reuse triplets
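To illustrate what triplet fine-tuning optimizes, here is a minimal NumPy sketch of the triplet objective (Euclidean-distance margin loss, the default in sentence-transformers' `TripletLoss`). The 3-d vectors are toy stand-ins, not real model embeddings:

```python
import numpy as np

def cos_sim(a, b):
    # Cosine similarity between two vectors, as used at inference time.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Pull the positive closer to the anchor than the negative,
    # by at least `margin`; zero loss once that holds.
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

# Toy embeddings (illustrative only).
anchor   = np.array([1.0, 0.0, 0.0])  # biblical verse
positive = np.array([0.9, 0.1, 0.0])  # patristic reuse of that verse
negative = np.array([0.0, 1.0, 0.0])  # unrelated sentence

loss = triplet_loss(anchor, positive, negative)
```

Here the positive is already closer to the anchor than the negative by more than the margin, so the loss is zero; during training, non-zero triplets push reuses and their sources together in embedding space.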

Evaluation

See the eval/ folder for evaluation metrics on the held-out test set.

Citation

If you use this model, please cite:

@misc{patriSBERT2026,
  author = {TdelaSelle},
  title  = {PatriSBERT-STS},
  year   = {2026},
  url    = {https://huggingface.co/TdelaSelle/PatriSBERT-STS}
}