PatriSBERT

A Sentence-BERT model fine-tuned for semantic textual similarity on patristic and biblical Latin texts. It is designed to detect and measure text reuse between early Christian writings and the Vulgate Bible.

Usage

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("TdelaSelle/PatriSBERT")

sentences = [
    "In principio erat Verbum",
    "Et Verbum caro factum est",
]
embeddings = model.encode(sentences)
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Cosine similarity: {similarity.item():.4f}")
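
For text-reuse detection, a common pattern is to compare one patristic passage against several candidate Vulgate verses and keep those above a similarity cutoff. The sketch below illustrates that pattern and is not part of the model's API: the names `cosine` and `rank_reuse_candidates` and the default `threshold` are hypothetical. The function accepts any `encode` callable that maps a list of strings to a list of vectors, so it can be tried without downloading the model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_reuse_candidates(encode, query, candidates, threshold=0.5):
    """Return (candidate, score) pairs scoring at least `threshold`, best first.

    `encode` is any callable mapping a list of strings to a list of vectors.
    `threshold` is an illustrative value, not a calibrated cutoff.
    """
    vectors = encode([query] + list(candidates))
    query_vec, candidate_vecs = vectors[0], vectors[1:]
    scored = [
        (text, cosine(query_vec, vec))
        for text, vec in zip(candidates, candidate_vecs)
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

With the model loaded as above, passing `model.encode` as the `encode` argument scores each candidate verse against the query. A suitable cutoff would need to be chosen on real data.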

Training

Base model: PatriSBERT-NLI, an SBERT model trained on an NLI-style dataset of Latin biblical reuses, itself built on PatriBERT (a BERT model pre-trained on Latin patristic texts)
Task: Semantic textual similarity (STS) via triplet fine-tuning
Dataset: Latin biblical reuse triplets
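
This card does not state which loss function or margin was used for the triplet fine-tuning. As a rough illustration of the general idea only, the sketch below computes a standard margin-based triplet loss with cosine distance for a single (anchor, positive, negative) triplet. The function names and the margin of 0.5 are assumptions made for the example, not values taken from the actual training setup.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Margin-based triplet loss for one triplet of embedding vectors.

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (an illustrative value).
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In a reuse triplet, the anchor could be a patristic passage, the positive the biblical verse it reuses, and the negative an unrelated verse. Minimizing this kind of loss pulls the first pair together and pushes the second apart in embedding space.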

Evaluation

See the eval/ folder for evaluation metrics on the held-out test set.

Citation

If you use this model, please cite:

@misc{patriSBERT2026,
  author = {TdelaSelle},
  title  = {PatriSBERT},
  year   = {2026},
  url    = {https://huggingface.co/TdelaSelle/PatriSBERT}
}

