answerdotai/JaColBERTv2.5

Sentence Similarity

bclavie committed on Jul 29, 2024

Commit

24af020

verified

1 Parent(s): a2b1162

Create README.md

Files changed (1)

README.md +25 -0

README.md ADDED

	@@ -0,0 +1,25 @@

+---
+inference: false
+datasets:
+- answerdotai/MMARCO-japanese-32-scored-triplets
+- miracl/miracl
+- hotchpotch/JQaRA
+- matsuxr/JaGovFaqs-22k
+- unicamp-dl/mmarco
+language:
+- ja
+pipeline_tag: sentence-similarity
+tags:
+- ColBERT
+base_model:
+- cl-tohoku/bert-base-japanese-v3
+- bclavie/JaColBERT
+license: mit
+library_name: RAGatouille
+---
+Model weights for the final JaColBERTv2.5 checkpoint, trained with an entirely overhauled training recipe on just 40% of the data used for JaColBERTv2.
+This model substantially outperforms all previous approaches, including JaColBERTv2 and multilingual models such as BGE-M3, on all datasets.
+This page will be updated with the full details and the model report in the next few days.