Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ The **miCSE** language model is trained for sentence similarity computation. Tra


 # Intended Use
-The model intended to be used for encoding sentences or short paragraphs. Given an input text, the model produces a vector embedding, which captures the semantics. The embedding can be used for numerous tasks, e.g., retrieval
+The model intended to be used for encoding sentences or short paragraphs. Given an input text, the model produces a vector embedding, which captures the semantics. The embedding can be used for numerous tasks, e.g., **retrieval**, **clustering** or **sentence similarity** comparison (see example below).


 # Model Usage
@@ -64,7 +64,7 @@ print(f"Distance: {cos_sim[0,1].detach().item()}")

 # Training data

-The model was trained on a random collection of sentences from Wikipedia: [Training data file](https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt)
+The model was trained on a random collection of **English** sentences from Wikipedia: [Training data file](https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt)

 # Benchmark

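
The updated Intended Use line names retrieval and sentence similarity as uses for the embeddings. As a rough illustration only, and not the README's own example, the sketch below ranks a few vectors by cosine similarity to a query. The helper names `cosine_similarity` and `rank_by_similarity` and the tiny 3-dimensional vectors are assumptions made up for this note; they are not model outputs, and real sentence embeddings have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, corpus):
    """Return corpus indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, doc) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Hypothetical toy embeddings (made up for illustration, not produced by miCSE).
query = [1.0, 0.0, 1.0]
corpus = [
    [0.9, 0.1, 1.1],    # nearly the same direction as the query
    [-1.0, 0.5, -1.0],  # roughly the opposite direction
    [0.0, 1.0, 0.0],    # orthogonal to the query
]

ranking = rank_by_similarity(query, corpus)
print(ranking)
```

In practice the vectors would come from the model as described under Model Usage. The sketch does not handle zero vectors, which would cause a division by zero.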