|
|
--- |
|
|
tags: |
|
|
- sentence-transformers |
|
|
- sentence-similarity |
|
|
- feature-extraction |
|
|
- dense |
|
|
- generated_from_trainer |
|
|
- dataset_size:99840 |
|
|
- loss:MultipleNegativesRankingLoss |
|
|
widget: |
|
|
- source_sentence: na. es. sui .s. siqs aut̃. It nrͣ hodie ꝙ ea demũ sit ma. ⁊ a. |
|
|
g̃iadħ de ni ma. ⁊ a. siadħ de nolib ubis ⁊ młrib e Uia ex ꝯcubinis filu nascũt᷑ |
|
|
uales ⁊ te nalib faluꝰ usdeimꝰ ⁊ de mr̃ib eoꝵ .s. qui dicãt᷑ nales. ⁊ ꝙͣtũ |
|
|
pocut eiꝰ relinqͥ. ł int̾ iuiuoꝵ. ⁊ ĩ ucima nol̃tate quoqͣ.tdari ⁊ postea ꝓseqũ |
|
|
teꝰ denorẽ ice. dicemꝰ quib ex cai lb̾i uales fi ant sus .i. redigãt᷑ in potatẽ |
|
|
ꝑentũ. ⁊ de h tͣctatu |
|
|
sentences: |
|
|
- na. es. sui .s. siqs aut̃. It nrͣ hodie ꝙ ea demũ sit ma. ⁊ a. g̃iadħ de ni ma. |
|
|
⁊ a. siadħ de nolib ubis ⁊ młrib e Uia ex ꝯcubinis filu nascũt᷑ uales ⁊ te |
|
|
nalib faluꝰ usdeimꝰ ⁊ de mr̃ib eoꝵ .s. qui dicãt᷑ nales. ⁊ ꝙͣtũ pocut eiꝰ |
|
|
relinqͥ. ł int̾ iuiuoꝵ. ⁊ ĩ ucima nol̃tate quoqͣ.tdari ⁊ postea ꝓseqũ teꝰ denorẽ |
|
|
ice. dicemꝰ quib ex cai lb̾i uales fi ant sus .i. redigãt᷑ in potatẽ ꝑentũ. |
|
|
⁊ de h tͣctatu |
|
|
- 'illius excubaret: ibidem ꝓ fide xp̃i aꝑsecutorib tradita est qi cum digna & |
|
|
eumenia: & eupe Ciuitate falare: passio scõtu graciliani. & felicissime iurg |
|
|
nis. Quoꝵ ora ꝓxp̃o contusi lapidib. dehinc gladio ꝑcusi optatam martytii suscepert̃ |
|
|
palmam. idus augusti.' |
|
|
- 'Et nos Poncius Ugo, Dei gracia Impuriarum comes predictus, promitimus vobis Raimundo |
|
|
Xetmario, nomine dicte domne Marchisie, predictam forciam deffendere ab omni homine |
|
|
qui a te directum accipere noluerit vel facere. Assigno etiam vobis et dono in |
|
|
feudum, in esmendam dicti careu, dictos V squillatas milii, annuatim accipiendas |
|
|
in festo Omnium Sanctorum, in omnibus nostris directis et taschis quas accipimus |
|
|
in stagno de Cils. Actum est hoc VII kalendas novembris anno Domini MºCCºLXXº |
|
|
octavo. Sig(+)num Raimundi Xetmarii predicti, qui hoc firmo et laudo. Sig(+)num |
|
|
Ponci Ugonis, Dei gracia comis Impuriarum predicti, qui hoc firmamus et laudamus. |
|
|
Testes huius rei sunt: Bernardus de Palaciolo de Villanova, et Berengarius de |
|
|
Lanciano, et Guilelmus Alferici et Simon de Trilia, milites.' |
|
|
- source_sentence: co segnito se dicitcis. Quid qͣntis. Qiditeit ti. nabiꝙ s int̾prctatum |
|
|
magr̃ubhtͥtał. di ca cis docnite r itdete uencrunt ⁊ intertubi ibi manceto ⁊ mancrunt |
|
|
sbidit illo. Hora aut̃ trar q̃i deciina. Eint auł anoicas tẽ srmo ni petͥ unis |
|
|
ce ouioui qui aud itrant a ioilt ⁊ situti fucrant guns. Inicrut hu trem suilss |
|
|
mont̃ ⁊ dicit ci. quutumui inesilq intͥpꝰ fͤ Et arduxi cum ad ihͥm. Intuit uit᷑issctil |
|
|
ois. |
|
|
sentences: |
|
|
- 'Si qua Deos tangit pietas, Astraeaque vivit, Castigatque Reum torto Rhamnusia |
|
|
plumbo: Et te poena manet Physignathe, nec Rhadamanthi Effugies uncum.' |
|
|
- Sic absentibus ducibus praedictis et caeteris regni primatibus, reconciliatus |
|
|
est rex Saxonibus simulatorie, et cum eis ad usque Goslariam pervenit, non multum |
|
|
tamen confidens in illis. Roudolfus dux et caeteri rebelles reconciliantur regi. |
|
|
- co segnito se dicitcis. Quid qͣntis. Qiditeit ti. nabiꝙ s int̾prctatum magr̃ubhtͥtał. |
|
|
di ca cis docnite r itdete uencrunt ⁊ intertubi ibi manceto ⁊ mancrunt sbidit |
|
|
illo. Hora aut̃ trar q̃i deciina. Eint auł anoicas tẽ srmo ni petͥ unis ce ouioui |
|
|
qui aud itrant a ioilt ⁊ situti fucrant guns. Inicrut hu trem suilss mont̃ ⁊ dicit |
|
|
ci. quutumui inesilq intͥpꝰ fͤ Et arduxi cum ad ihͥm. Intuit uit᷑issctil ois. |
|
|
- source_sentence: 'israhelitieo popło inͦ deserto p̃cessit: ipse in euan geliis deserta |
|
|
gentiũ uisitauit: Et ꝑ quẽ tunc in srna huic generationi manna defluxit: ipse |
|
|
xp̃ianȩ genti. corporis sanguinis sui manna in xecclesia subministrat.' |
|
|
sentences: |
|
|
- Sed ne quid videretur omissum, aut nostro potuisset dubium cordi remanere, ad |
|
|
beati Petri sacratissimum corpus districta eum ex abundanti fecimus sacramenta |
|
|
praebere. Quibus praestitis, magna sumus exsultatione gavisi, quod hujuscemodi |
|
|
experimento innocentia ejus evidenter enituit. Pro qua re gloria vestra praedictum |
|
|
virum cum omni charitate suscipiat, et reverentiam ei, qualem sacerdoti decet, |
|
|
exhibeat, nec quaedam cordibus remaneat de iis quae sunt jam purgata dubietas. |
|
|
Sed ita suprascripto vos episcopo devotissime oportet in omnibus adhaerere, ut |
|
|
congrue decenterque Deum in ejus persona cujus minister est videamini honorare. |
|
|
EPISTOLA XXXIV. |
|
|
- La char d'une joe de beuf tranchee par lesches et mise en pasté, et puis, quant |
|
|
elle est cuicte, gecter la saulse d'un halebran dedens. En la haste menue d'un |
|
|
pourcel n'a aucun appareil a faire fors le laver et embrocher et enveloper de |
|
|
sa taye, et cuire longuement. Poules farcies coulourees ou dorees. |
|
|
- 'israhelitieo popło inͦ deserto p̃cessit: ipse in euan geliis deserta gentiũ |
|
|
uisitauit: Et ꝑ quẽ tunc in srna huic generationi manna defluxit: ipse xp̃ianȩ |
|
|
genti. corporis sanguinis sui manna in xecclesia subministrat.' |
|
|
- source_sentence: 'ad renonationem ietłm ꝑueniret. Ee xagesim oita scdo anno regno |
|
|
rogis chasdeoꝵ subũso. cui ad huc octo anti: ad regnandͥ: restabant. oꝵ etiam |
|
|
sexta insio damelis ostendit pp̃ cãm : ettinctim. datiꝰ qui medis imꝑabat u septima |
|
|
uisio danieł ostendit memoratori tegno fucces sit. Cmꝰ pͥmo anno regtui: ut decimauisio |
|
|
ba ncł ostendit. supputatis unis uidens aꝓpĩ. qͣre tempꝰ reũsionis. que ꝑ leremiã |
|
|
fũat ꝓtais.' |
|
|
sentences: |
|
|
- 'Nos igitur attendentes, quòd ad Religionem conversi, si fuerint in suis locis |
|
|
laudabiliter conversati, illegitimitatis macula non obstante, juris permissione |
|
|
licenter possunt ad Ordines promoveri b , discretioni tuæ præsentium auctoritate |
|
|
committimus, quatenus cum eodem Presbytero, ejus ad hoc suffragantibus meritis, |
|
|
super quibus tuam intendimus conscientiam onerare, quòd hujusmodi non obstante |
|
|
defectu in susceptis Ordinibus ministrare, & ad regulares administrationes dicti |
|
|
Ordinis dumtaxat assumi valeat, auctoritate nostra dispenses, prout secundùm Deum |
|
|
animæ suæ saluti videris expedire: Ita tamen quòd dictus Conradus nullatenus præficiatur |
|
|
in eodem Ordine in Ministrum . Datum Romæ apud Sanctam Mariam Majorem sexto Nonas |
|
|
Maii, Pontificatus nostri Anno Secundo.' |
|
|
- Nam intellectum proferentis in eo significare uox dicitur, quod ipsum auditori |
|
|
manifestat, dum consimilem in auditore generat. Unde Priscianus articulatam, id |
|
|
est significatiuam, uocem esse dicit, quae coartatur cum sensu proferentis, id |
|
|
est quam ipse proferens intendit proferre ad manifestandum intellectum suum. In |
|
|
quo quidem uocem articulatam, id est significatiuam, eum accipere dicunt quantum |
|
|
ad intellectum proferentis quem manifestat, non ad intellectum auditoris, quem |
|
|
generat. |
|
|
- 'ad renonationem ietłm ꝑueniret. Ee xagesim oita scdo anno regno rogis chasdeoꝵ |
|
|
subũso. cui ad huc octo anti: ad regnandͥ: restabant. oꝵ etiam sexta insio damelis |
|
|
ostendit pp̃ cãm : ettinctim. datiꝰ qui medis imꝑabat u septima uisio danieł |
|
|
ostendit memoratori tegno fucces sit. Cmꝰ pͥmo anno regtui: ut decimauisio ba |
|
|
ncł ostendit. supputatis unis uidens aꝓpĩ. qͣre tempꝰ reũsionis. que ꝑ leremiã |
|
|
fũat ꝓtais.' |
|
|
- source_sentence: 'cta test̾i sur: u̾bi qd̾ madauit in mil egones. Q uod disposunt |
|
|
ad abrahã. mũti sui ad p̃saaci Et statuit il acob ĩ p̾ceptũ: ⁊ isrł mn testiñ |
|
|
etꝰ Dices tibi dabo t̾ram chanaan: fu ncdũ heditatis ur̃e. Dũ e̾e̾nt nũo ocui. |
|
|
paucissimi ⁊ ĩcole ouis. Et ꝑtͣni eẽt de gnͣte ĩ gentẽ: ⁊ de regno ad ulũ |
|
|
alterũ. Non reliquit hoĩem' |
|
|
sentences: |
|
|
- p̾mioꝵ. p̃s. b̾ildixit finis tuus inte. Et ĩminitas apee cato. Qua xp̃c donatus |
|
|
e̾ ps. p̾ucinsti eũ i bñ. dicidis ¶Infernans quo dupiex .s. adinacio. ps. laudat᷑ͣ |
|
|
ptc̃ce indesidus aĩe sue ⁊ ñquis bñdi. et cũ quis sibi tribuit bona que ht̃ |
|
|
atco. Iob. timebat enĩ ne forte peccau̾int fuii eius. ⁊ bñdix̾int deo incordib |
|
|
suis. Corꝑans ẽ ad carnis delecta tr̃em us. or̃s caro feñ. ⁊ oĩs gła euis qiͣ |
|
|
d̾r ꝑ ysaiam. ue qͥ niungitis domũ addom̃. ⁊ agr̃ ago copłatis us ad t̾minũ |
|
|
ioci. Nñquid ħ̾itabitis uos so |
|
|
- 'cta test̾i sur: u̾bi qd̾ madauit in mil egones. Q uod disposunt ad abrahã. mũti |
|
|
sui ad p̃saaci Et statuit il acob ĩ p̾ceptũ: ⁊ isrł mn testiñ etꝰ Dices tibi |
|
|
dabo t̾ram chanaan: fu ncdũ heditatis ur̃e. Dũ e̾e̾nt nũo ocui. paucissimi |
|
|
⁊ ĩcole ouis. Et ꝑtͣni eẽt de gnͣte ĩ gentẽ: ⁊ de regno ad ulũ alterũ. Non |
|
|
reliquit hoĩem' |
|
|
- 'uintia orientalium anglorũque fuerint gesta eccdesi astica pastim & strap̃tis |
|
|
ul tra diqone pranũ partim re uerentis simi abbatis ̃ relatione conperimus: Atiuero |
|
|
in prouintia lindissi que sint gesta eo fidemxp̃i queue suc cessia sacerdotalis |
|
|
eꝓterit uollitteris reueren tissii antisteris ciniberti uł aliorũ fidebũ uirorum |
|
|
anagubee alidigimus: D auterinnoa danhrsibrorum prouintia eo quo rem pu fueiu |
|
|
api ꝑceperunt us quea sp̃ sens ꝑduersas regiones: ineccle siitio aecta non uno |
|
|
quolibet auqtere sed fidelimnumerta umtestiũ nulie csar uel neminis sepoterant |
|
|
adsertione cagnoui: eceptis his' |
|
|
pipeline_tag: sentence-similarity |
|
|
library_name: sentence-transformers |
|
|
--- |
|
|
|
|
|
# SentenceTransformer |
|
|
|
|
|
This is a [sentence-transformers](https://www.SBERT.net) model trained on 99,840 sentence pairs. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. |
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
- **Model Type:** Sentence Transformer |
|
|
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) --> |
|
|
- **Maximum Sequence Length:** 8192 tokens |
|
|
- **Output Dimensionality:** 768 dimensions |
|
|
- **Similarity Function:** Cosine Similarity |
|
|
<!-- - **Training Dataset:** Unknown --> |
|
|
<!-- - **Language:** Unknown --> |
|
|
<!-- - **License:** Unknown --> |
|
|
|
|
|
### Model Sources |
|
|
|
|
|
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net) |
|
|
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers) |
|
|
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers) |
|
|
|
|
|
### Full Model Architecture |
|
|
|
|
|
``` |
|
|
SentenceTransformer( |
|
|
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'}) |
|
|
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True}) |
|
|
) |
|
|
``` |
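
The Pooling module above averages token embeddings over the sequence (`pooling_mode_mean_tokens: True`), ignoring padding. As an illustration only (toy numbers, not code from this repository), a masked mean pool can be sketched in plain NumPy:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings, counting only non-padding positions.

    token_embeddings: (seq_len, dim) array of per-token vectors
    attention_mask:   (seq_len,) array of 1s (real tokens) and 0s (padding)
    """
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1) for broadcasting
    summed = (token_embeddings * mask).sum(axis=0)    # sum of real-token vectors
    count = np.clip(mask.sum(), 1e-9, None)           # avoid division by zero
    return summed / count

tokens = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])            # third position is padding
print(mean_pool(tokens, mask))        # [2. 3.] — the padding row is excluded
```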
|
|
|
|
|
## Usage |
|
|
|
|
|
### Direct Usage (Sentence Transformers) |
|
|
|
|
|
First install the Sentence Transformers library: |
|
|
|
|
|
```bash |
|
|
pip install -U sentence-transformers |
|
|
``` |
|
|
|
|
|
Then you can load this model and run inference. |
|
|
```python |
|
|
from sentence_transformers import SentenceTransformer |
|
|
|
|
|
# Download from the 🤗 Hub |
|
|
model = SentenceTransformer("sentence_transformers_model_id") |
|
|
# Run inference |
|
|
sentences = [ |
|
|
'cta test̾i sur: u̾bi qd̾ madauit in mil egones. Q uod disposunt ad abrahã. mũti sui ad p̃saaci Et statuit il acob ĩ p̾ceptũ: ⁊ isrł mn testiñ etꝰ Dices tibi dabo t̾ram chanaan: fu ncdũ heditatis ur̃e. Dũ e̾e̾nt nũo ocui. paucissimi ⁊ ĩcole ouis. Et ꝑtͣni eẽt de gnͣte ĩ gentẽ: ⁊ de regno ad ulũ alterũ. Non reliquit hoĩem', |
|
|
'cta test̾i sur: u̾bi qd̾ madauit in mil egones. Q uod disposunt ad abrahã. mũti sui ad p̃saaci Et statuit il acob ĩ p̾ceptũ: ⁊ isrł mn testiñ etꝰ Dices tibi dabo t̾ram chanaan: fu ncdũ heditatis ur̃e. Dũ e̾e̾nt nũo ocui. paucissimi ⁊ ĩcole ouis. Et ꝑtͣni eẽt de gnͣte ĩ gentẽ: ⁊ de regno ad ulũ alterũ. Non reliquit hoĩem', |
|
|
'p̾mioꝵ. p̃s. b̾ildixit finis tuus inte. Et ĩminitas apee cato. Qua xp̃c donatus e̾ ps. p̾ucinsti eũ i bñ. dicidis ¶Infernans quo\uf1ac dupiex .s. adinacio. ps. laudat᷑ͣ ptc̃ce indesidus aĩe sue ⁊ ñquis bñdi. et cũ quis sibi tribuit bona que ht̃ atco. Iob. timebat enĩ ne forte peccau̾int fuii eius. ⁊ bñdix̾int deo incordib\uf1ac suis. Corꝑans ẽ ad carnis delecta tr̃em us. or̃s caro feñ. ⁊ oĩs gła euis qiͣ d̾r ꝑ ysaiam. ue qͥ niungitis domũ addom̃. ⁊ agr̃ ago copłatis us\uf1ac ad t̾minũ ioci. Nñquid ħ̾itabitis uos so', |
|
|
] |
|
|
embeddings = model.encode(sentences) |
|
|
print(embeddings.shape) |
|
|
# [3, 768] |
|
|
|
|
|
# Get the similarity scores for the embeddings |
|
|
similarities = model.similarity(embeddings, embeddings) |
|
|
print(similarities) |
|
|
# tensor([[1.0000, 1.0000, 0.2812], |
|
|
# [1.0000, 1.0000, 0.2812], |
|
|
# [0.2812, 0.2812, 1.0000]]) |
|
|
``` |
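
`model.similarity` uses the cosine similarity listed under Model Details. For intuition, the same score matrix can be computed by hand from the embeddings; this standalone NumPy sketch (toy vectors, not real model outputs) mirrors that computation:

```python
import numpy as np

def cos_sim(a, b):
    """Pairwise cosine similarity between the rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)  # L2-normalize rows
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T                                    # dot products of unit vectors

emb = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(cos_sim(emb, emb))
# rows 0 and 1 are identical (similarity 1.0); row 2 is orthogonal to them (0.0)
```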
|
|
|
|
|
<!-- |
|
|
### Direct Usage (Transformers) |
|
|
|
|
|
<details><summary>Click to see the direct usage in Transformers</summary> |
|
|
|
|
|
</details> |
|
|
--> |
|
|
|
|
|
<!-- |
|
|
### Downstream Usage (Sentence Transformers) |
|
|
|
|
|
You can finetune this model on your own dataset. |
|
|
|
|
|
<details><summary>Click to expand</summary> |
|
|
|
|
|
</details> |
|
|
--> |
|
|
|
|
|
<!-- |
|
|
### Out-of-Scope Use |
|
|
|
|
|
*List how the model may foreseeably be misused and address what users ought not to do with the model.* |
|
|
--> |
|
|
|
|
|
<!-- |
|
|
## Bias, Risks and Limitations |
|
|
|
|
|
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.* |
|
|
--> |
|
|
|
|
|
<!-- |
|
|
### Recommendations |
|
|
|
|
|
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.* |
|
|
--> |
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Training Dataset |
|
|
|
|
|
#### Unnamed Dataset |
|
|
|
|
|
* Size: 99,840 training samples |
|
|
* Columns: <code>sentence_0</code> and <code>sentence_1</code> |
|
|
* Approximate statistics based on the first 1000 samples: |
|
|
| | sentence_0 | sentence_1 | |
|
|
|:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------| |
|
|
| type | string | string | |
|
|
| details | <ul><li>min: 6 tokens</li><li>mean: 85.65 tokens</li><li>max: 473 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 85.65 tokens</li><li>max: 473 tokens</li></ul> | |
|
|
* Samples: |
|
|
| sentence_0 | sentence_1 | |
|
|
|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------| |
|
|
| <code>Per totum namque mundum est mundus; et mundum persequitur mundus, coinquinatus mundum, perditus redemptum, damnatus salvatum.</code> | <code>Per totum namque mundum est mundus; et mundum persequitur mundus, coinquinatus mundum, perditus redemptum, damnatus salvatum.</code> | |
|
|
| <code>motꝰ siait supͣ sepe dixmꝰ gꝰ anteon nem generanonem est motus ge eti am aute generaitionem primi mobilis est mo tus go etiam motus est. inte p̾mum mo tum᷑ ꝙ est impossibile go fint hec caisa ꝙ motus non eet̾ sꝑ momĩ p̾tito iprẽ ꝙ primum mobile oportet᷑ prius generari mẽe et postea moneri qr absq dubio se queret᷑ ꝙ quedam mutatio eet̃ anteil</code> | <code>motꝰ siait supͣ sepe dixmꝰ gꝰ anteon nem generanonem est motus ge eti am aute generaitionem primi mobilis est mo tus go etiam motus est. inte p̾mum mo tum᷑ ꝙ est impossibile go fint hec caisa ꝙ motus non eet̾ sꝑ momĩ p̾tito iprẽ ꝙ primum mobile oportet᷑ prius generari mẽe et postea moneri qr absq dubio se queret᷑ ꝙ quedam mutatio eet̃ anteil</code> | |
|
|
| <code>Dictum est, id quod in nomine confuse significaretur, in definitione quae fit enumeratione partium, aperiri atque explicari. Quod fieri non potest, nisi per quarumdam partium nuncupationem; nihil enim dum explicatur oratione, totum simul dici potest. Quae cum ita sint, cumque omnis hujusmodi definitio quaedam sit partium distributio, quatuor his modis fieri potest. Aut enim substantiales partes explicantur, aut proprietatis partes dicuntur, aut quasi totius membra enumerantur, aut tanquam species dividuntur. Substantiales partes explicantur, cum ex genere ac differentiis definitio constituitur. Genus enim quod singulariter praedicatur, speciei totum est. Id genus sumptum in definitione, pars quaedam fit. Non enim solum speciem complet, nisi adjiciantur etiam differentiae, in quibus eadem ratio quae in genere est. Nam cum ipsae singulariter dictae totam speciem claudant, in definitione sumptae, partes speciei fiunt, quia non solum speciem quidem esse designant, sed etiam genus.</code> | <code>Dictum est, id quod in nomine confuse significaretur, in definitione quae fit enumeratione partium, aperiri atque explicari. Quod fieri non potest, nisi per quarumdam partium nuncupationem; nihil enim dum explicatur oratione, totum simul dici potest. Quae cum ita sint, cumque omnis hujusmodi definitio quaedam sit partium distributio, quatuor his modis fieri potest. Aut enim substantiales partes explicantur, aut proprietatis partes dicuntur, aut quasi totius membra enumerantur, aut tanquam species dividuntur. Substantiales partes explicantur, cum ex genere ac differentiis definitio constituitur. Genus enim quod singulariter praedicatur, speciei totum est. Id genus sumptum in definitione, pars quaedam fit. Non enim solum speciem complet, nisi adjiciantur etiam differentiae, in quibus eadem ratio quae in genere est. Nam cum ipsae singulariter dictae totam speciem claudant, in definitione sumptae, partes speciei fiunt, quia non solum speciem quidem esse designant, sed etiam genus.</code> | |
|
|
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters: |
|
|
```json |
|
|
{ |
|
|
"scale": 20.0, |
|
|
"similarity_fct": "cos_sim", |
|
|
"gather_across_devices": false |
|
|
} |
|
|
``` |
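
MultipleNegativesRankingLoss treats every other positive in the batch as a negative: with `similarity_fct: cos_sim` and `scale: 20.0`, it is cross-entropy over a scaled cosine-similarity matrix whose correct class for anchor *i* is its own positive on the diagonal. A minimal NumPy sketch of that computation (illustrative, not the library implementation):

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; pair i is the
    positive for anchor i, every other row acts as an in-batch negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                     # (batch, batch) similarity matrix
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # diagonal = correct pairs

anchors = np.eye(2)                 # two orthogonal toy "embeddings"
print(mnr_loss(anchors, anchors))   # near 0: each anchor already ranks its own positive first
```

Larger batches give the loss more in-batch negatives per anchor, which is why the `per_device_train_batch_size` of 128 below matters for this objective.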
|
|
|
|
|
### Training Hyperparameters |
|
|
#### Non-Default Hyperparameters |
|
|
|
|
|
- `per_device_train_batch_size`: 128 |
|
|
- `per_device_eval_batch_size`: 128 |
|
|
- `num_train_epochs`: 1 |
|
|
- `fp16`: True |
|
|
- `multi_dataset_batch_sampler`: round_robin |
|
|
|
|
|
#### All Hyperparameters |
|
|
<details><summary>Click to expand</summary> |
|
|
|
|
|
- `overwrite_output_dir`: False |
|
|
- `do_predict`: False |
|
|
- `eval_strategy`: no |
|
|
- `prediction_loss_only`: True |
|
|
- `per_device_train_batch_size`: 128 |
|
|
- `per_device_eval_batch_size`: 128 |
|
|
- `per_gpu_train_batch_size`: None |
|
|
- `per_gpu_eval_batch_size`: None |
|
|
- `gradient_accumulation_steps`: 1 |
|
|
- `eval_accumulation_steps`: None |
|
|
- `torch_empty_cache_steps`: None |
|
|
- `learning_rate`: 5e-05 |
|
|
- `weight_decay`: 0.0 |
|
|
- `adam_beta1`: 0.9 |
|
|
- `adam_beta2`: 0.999 |
|
|
- `adam_epsilon`: 1e-08 |
|
|
- `max_grad_norm`: 1 |
|
|
- `num_train_epochs`: 1 |
|
|
- `max_steps`: -1 |
|
|
- `lr_scheduler_type`: linear |
|
|
- `lr_scheduler_kwargs`: {} |
|
|
- `warmup_ratio`: 0.0 |
|
|
- `warmup_steps`: 0 |
|
|
- `log_level`: passive |
|
|
- `log_level_replica`: warning |
|
|
- `log_on_each_node`: True |
|
|
- `logging_nan_inf_filter`: True |
|
|
- `save_safetensors`: True |
|
|
- `save_on_each_node`: False |
|
|
- `save_only_model`: False |
|
|
- `restore_callback_states_from_checkpoint`: False |
|
|
- `no_cuda`: False |
|
|
- `use_cpu`: False |
|
|
- `use_mps_device`: False |
|
|
- `seed`: 42 |
|
|
- `data_seed`: None |
|
|
- `jit_mode_eval`: False |
|
|
- `use_ipex`: False |
|
|
- `bf16`: False |
|
|
- `fp16`: True |
|
|
- `fp16_opt_level`: O1 |
|
|
- `half_precision_backend`: auto |
|
|
- `bf16_full_eval`: False |
|
|
- `fp16_full_eval`: False |
|
|
- `tf32`: None |
|
|
- `local_rank`: 0 |
|
|
- `ddp_backend`: None |
|
|
- `tpu_num_cores`: None |
|
|
- `tpu_metrics_debug`: False |
|
|
- `debug`: [] |
|
|
- `dataloader_drop_last`: False |
|
|
- `dataloader_num_workers`: 0 |
|
|
- `dataloader_prefetch_factor`: None |
|
|
- `past_index`: -1 |
|
|
- `disable_tqdm`: False |
|
|
- `remove_unused_columns`: True |
|
|
- `label_names`: None |
|
|
- `load_best_model_at_end`: False |
|
|
- `ignore_data_skip`: False |
|
|
- `fsdp`: [] |
|
|
- `fsdp_min_num_params`: 0 |
|
|
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
|
|
- `fsdp_transformer_layer_cls_to_wrap`: None |
|
|
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} |
|
|
- `parallelism_config`: None |
|
|
- `deepspeed`: None |
|
|
- `label_smoothing_factor`: 0.0 |
|
|
- `optim`: adamw_torch_fused |
|
|
- `optim_args`: None |
|
|
- `adafactor`: False |
|
|
- `group_by_length`: False |
|
|
- `length_column_name`: length |
|
|
- `ddp_find_unused_parameters`: None |
|
|
- `ddp_bucket_cap_mb`: None |
|
|
- `ddp_broadcast_buffers`: False |
|
|
- `dataloader_pin_memory`: True |
|
|
- `dataloader_persistent_workers`: False |
|
|
- `skip_memory_metrics`: True |
|
|
- `use_legacy_prediction_loop`: False |
|
|
- `push_to_hub`: False |
|
|
- `resume_from_checkpoint`: None |
|
|
- `hub_model_id`: None |
|
|
- `hub_strategy`: every_save |
|
|
- `hub_private_repo`: None |
|
|
- `hub_always_push`: False |
|
|
- `hub_revision`: None |
|
|
- `gradient_checkpointing`: False |
|
|
- `gradient_checkpointing_kwargs`: None |
|
|
- `include_inputs_for_metrics`: False |
|
|
- `include_for_metrics`: [] |
|
|
- `eval_do_concat_batches`: True |
|
|
- `fp16_backend`: auto |
|
|
- `push_to_hub_model_id`: None |
|
|
- `push_to_hub_organization`: None |
|
|
- `mp_parameters`: |
|
|
- `auto_find_batch_size`: False |
|
|
- `full_determinism`: False |
|
|
- `torchdynamo`: None |
|
|
- `ray_scope`: last |
|
|
- `ddp_timeout`: 1800 |
|
|
- `torch_compile`: False |
|
|
- `torch_compile_backend`: None |
|
|
- `torch_compile_mode`: None |
|
|
- `include_tokens_per_second`: False |
|
|
- `include_num_input_tokens_seen`: False |
|
|
- `neftune_noise_alpha`: None |
|
|
- `optim_target_modules`: None |
|
|
- `batch_eval_metrics`: False |
|
|
- `eval_on_start`: False |
|
|
- `use_liger_kernel`: False |
|
|
- `liger_kernel_config`: None |
|
|
- `eval_use_gather_object`: False |
|
|
- `average_tokens_across_devices`: False |
|
|
- `prompts`: None |
|
|
- `batch_sampler`: batch_sampler |
|
|
- `multi_dataset_batch_sampler`: round_robin |
|
|
- `router_mapping`: {} |
|
|
- `learning_rate_mapping`: {} |
|
|
|
|
|
</details> |
|
|
|
|
|
### Training Logs |
|
|
| Epoch | Step | Training Loss | |
|
|
|:------:|:----:|:-------------:| |
|
|
| 0.6410 | 500 | 0.1311 | |
|
|
|
|
|
|
|
|
### Framework Versions |
|
|
- Python: 3.12.11 |
|
|
- Sentence Transformers: 5.1.0 |
|
|
- Transformers: 4.56.0 |
|
|
- PyTorch: 2.8.0+cu128 |
|
|
- Accelerate: 1.10.1 |
|
|
- Datasets: 4.0.0 |
|
|
- Tokenizers: 0.22.0 |
|
|
|
|
|
## Citation |
|
|
|
|
|
### BibTeX |
|
|
|
|
|
#### Sentence Transformers |
|
|
```bibtex |
|
|
@inproceedings{reimers-2019-sentence-bert, |
|
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
|
month = "11", |
|
|
year = "2019", |
|
|
publisher = "Association for Computational Linguistics", |
|
|
url = "https://arxiv.org/abs/1908.10084", |
|
|
} |
|
|
``` |
|
|
|
|
|
#### MultipleNegativesRankingLoss |
|
|
```bibtex |
|
|
@misc{henderson2017efficient, |
|
|
title={Efficient Natural Language Response Suggestion for Smart Reply}, |
|
|
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil}, |
|
|
year={2017}, |
|
|
eprint={1705.00652}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CL} |
|
|
} |
|
|
``` |
|
|
|
|
|
<!-- |
|
|
## Glossary |
|
|
|
|
|
*Clearly define terms in order to be accessible across audiences.* |
|
|
--> |
|
|
|
|
|
<!-- |
|
|
## Model Card Authors |
|
|
|
|
|
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.* |
|
|
--> |
|
|
|
|
|
<!-- |
|
|
## Model Card Contact |
|
|
|
|
|
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.* |
|
|
--> |