Bingsu committed
Commit e9a9cb9 · 1 parent: 3ed7855

Update README.md

Files changed (1):
  README.md (+21 -6)
README.md CHANGED
@@ -5,14 +5,27 @@ tags:
 - feature-extraction
 - sentence-similarity
 - transformers
-
+language:
+- ko
+license:
+- mit
+widget:
+  source_sentence: "λŒ€ν•œλ―Όκ΅­μ˜ μˆ˜λ„λŠ” μ„œμšΈμž…λ‹ˆλ‹€."
+  sentences:
+  - "미ꡭ의 μˆ˜λ„λŠ” λ‰΄μš•μ΄ μ•„λ‹™λ‹ˆλ‹€."
+  - "λŒ€ν•œλ―Όκ΅­μ˜ μˆ˜λ„ μš”κΈˆμ€ μ €λ ΄ν•œ νŽΈμž…λ‹ˆλ‹€."
+  - "μ„œμšΈμ€ λŒ€ν•œλ―Όκ΅­μ˜ μˆ˜λ„μž…λ‹ˆλ‹€."
 ---
 
 # smartmind/roberta-ko-small-tsdae
 
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 256 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
-<!--- Describe your model here -->
+Korean RoBERTa small model pretrained with [TSDAE](https://arxiv.org/abs/2104.06979).
+
+This is a Korean RoBERTa model pretrained with [TSDAE](https://arxiv.org/abs/2104.06979). Its architecture is identical to [lassl/roberta-ko-small](https://huggingface.co/lassl/roberta-ko-small); the tokenizer is different.
+
+It can be used as-is to compute sentence similarity, or fine-tuned for your own task.
 
 ## Usage (Sentence-Transformers)
 
@@ -72,16 +85,18 @@ print(sentence_embeddings)
 
 ## Evaluation Results
 
-<!--- Describe how your model was evaluated -->
-
-For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=smartmind/roberta-ko-small-tsdae)
-
+The scores below were obtained on the [klue](https://huggingface.co/datasets/klue) STS data, **without** fine-tuning the model on this data.
+
+|split|cosine_pearson|cosine_spearman|euclidean_pearson|euclidean_spearman|manhattan_pearson|manhattan_spearman|dot_pearson|dot_spearman|
+|-----|--------------|---------------|-----------------|------------------|-----------------|------------------|-----------|------------|
+|train|0.8735|0.8676|0.8268|0.8357|0.8248|0.8336|0.8449|0.8383|
+|validation|0.5409|0.5349|0.4786|0.4657|0.4775|0.4625|0.5284|0.5252|
 
 
 ## Full Model Architecture
 ```
 SentenceTransformer(
-  (0): Transformer({'max_seq_length': 508, 'do_lower_case': False}) with Transformer model: RobertaModel
+  (0): Transformer({'max_seq_length': 508, 'do_lower_case': False}) with Transformer model: RobertaModel
   (1): Pooling({'word_embedding_dimension': 256, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
 )
 ```