demdecuong committed on
Commit
f4e85a4
·
1 Parent(s): 610888d

update result

  1. README.md +7 -3
README.md CHANGED
@@ -1,5 +1,5 @@
1
  This is a fine-tuned version of [SimCSE: Simple Contrastive Learning of Sentence Embeddings](https://arxiv.org/abs/2104.08821)
2
- , trained unsupervised on 100K triplet samples related to stroke from: stroke books, Quora medical, Quora stroke, and human annotations.
3
 
4
  ### Extract sentence representation
5
  ```
@@ -32,9 +32,13 @@ print(embedding.shape)
32
 
33
  ### Result
34
  On our company's PoC project, the test set contains human-generated positive/negative pairs of matching questions related to stroke.
 
 
 
35
 
36
  | Model | Top-1 Accuracy |
37
  | ------------- | ------------- |
38
- | SimCSE (supervised) | 75.83 |
39
  | SimCSE unsupervised (ours) | 76.66 |
40
- | SimCSE supervised (ours) | 73.33 |
 
 
1
  This is a fine-tuned version of [SimCSE: Simple Contrastive Learning of Sentence Embeddings](https://arxiv.org/abs/2104.08821)
2
+ , trained unsupervised on 100K triplet samples related to the stroke domain from: stroke books, Quora medical, Quora stroke, and human annotations. Positive sentences are generated by paraphrasing and back-translation. Negative sentences are randomly selected from the general domain.
3
 
4
  ### Extract sentence representation
5
  ```
 
32
 
33
  ### Result
34
  On our company's PoC project, the test set contains human-generated positive/negative pairs of matching questions related to stroke.
35
+ - SimCSE supervised + 100k: trained on 100K triplet samples covering the medical, stroke, and general domains
36
+ - SimCSE supervised + 42k: trained on 42K triplet samples covering the medical and stroke domains
37
+
38
 
39
  | Model | Top-1 Accuracy |
40
  | ------------- | ------------- |
41
+ | SimCSE supervised (author) | 75.83 |
42
  | SimCSE unsupervised (ours) | 76.66 |
43
+ | SimCSE supervised + 100k (ours) | 73.33 |
44
+ | SimCSE supervised + 42k (ours) | 75.83 |
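
The README does not say how Top-1 Accuracy is computed. One plausible reading, sketched here purely as an assumption, is retrieval-style scoring: each test question is embedded, compared by cosine similarity against a shared pool of candidate questions, and counted as correct when the highest-scoring candidate is its human-labelled match. The function names `cosine` and `top1_accuracy` and the toy 2-d vectors are hypothetical and are not taken from the repository or its test set.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top1_accuracy(queries, candidates, gold):
    """Percentage of queries whose most similar candidate is the gold match.

    queries    -- list of query embeddings
    candidates -- list of candidate embeddings shared by all queries
    gold       -- gold[i] is the index in `candidates` of the match for queries[i]
    """
    hits = 0
    for query, gold_index in zip(queries, gold):
        scores = [cosine(query, cand) for cand in candidates]
        # Index of the highest-scoring candidate (first one wins on ties).
        best = max(range(len(candidates)), key=scores.__getitem__)
        hits += int(best == gold_index)
    return 100.0 * hits / len(queries)


# Toy embeddings, illustrative only (assumed data, not the stroke test set).
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidates = [[0.9, 0.1], [0.1, 0.9], [-1.0, 0.0]]
gold = [0, 1, 2]

print(top1_accuracy(queries, candidates, gold))
```

In this toy run the first two queries retrieve their gold candidate and the third does not, so two of three queries count as hits. In practice the vectors would come from the model's sentence embeddings rather than hand-written lists.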