Sigurdur/isl-sbert-s

Sentence Similarity

sentence-transformers

feature-extraction

text-embeddings-inference

Sigurdur committed on Dec 24, 2023

Commit a8d1e15 · 1 Parent(s): c682154

Update README.md

Files changed (1)

README.md +12 -1

@@ -8,11 +8,22 @@ tags:
 ---
-# {MODEL_NAME}
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 <!--- Describe your model here -->
 ## Usage (Sentence-Transformers)

 ---
+# Icelandic SBERT for Sentence Embedding
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 <!--- Describe your model here -->
+## Data
+From CLARIN-IS: [unannotated News2 from IGC (RMH)](https://repository.clarin.is/repository/xmlui/handle/20.500.12537/238)
+I figured the most modern and common sentences would appear in the news, so I chose this dataset. For more sophisticated language, the books dataset would be better.
+To download the data, run the following command:
+```bash
+curl --remote-name-all https://repository.clarin.is/repository/xmlui/bitstream/handle/20.500.12537/238{/IGC-News2-22.10.TEI.zip}
+```
 ## Usage (Sentence-Transformers)
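
The model card describes the model as mapping sentences to 768-dimensional vectors for tasks like semantic search. A minimal sketch of that use follows, assuming the Hub id is `Sigurdur/isl-sbert-s` (taken from the repository name, not confirmed by the README) and that the `sentence-transformers` package is installed. The helper names `cosine_similarity`, `rank_by_similarity`, `embed`, and `demo`, and the example sentences, are illustrative and not part of the original README.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus_vecs):
    """Return corpus indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in corpus_vecs]
    return sorted(range(len(corpus_vecs)), key=lambda i: scores[i], reverse=True)


def embed(sentences, model_name="Sigurdur/isl-sbert-s"):
    """Encode sentences with the model (downloads weights on first use).

    The model id is an assumption based on the repository name.
    """
    # Imported lazily so the ranking helpers work without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(sentences).tolist()


def demo():
    """Rank two made-up Icelandic sentences against a made-up query."""
    corpus = ["Veðrið er gott í dag.", "Ríkisstjórnin kynnti ný fjárlög."]
    query = "Það er sól og blíða úti."
    vectors = embed([query] + corpus)
    for i in rank_by_similarity(vectors[0], vectors[1:]):
        print(corpus[i])
```

Calling `demo()` needs network access the first time, since the weights are fetched from the Hub. The ranking helpers are kept free of third-party imports so they can be reused on any precomputed embeddings.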