KhoaUIT committed on
Commit d375309 · verified · 1 Parent(s): d393f29

Update README.md

Files changed (1)
  1. README.md +141 -139
README.md CHANGED
@@ -1,140 +1,142 @@
- ---
- tags:
- - sentence-transformers
- - sentence-similarity
- - feature-extraction
- pipeline_tag: sentence-similarity
- library_name: sentence-transformers
- ---
-
- # SentenceTransformer
-
- This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
-
- ## Model Details
-
- ### Model Description
- - **Model Type:** Sentence Transformer
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- - **Maximum Sequence Length:** 512 tokens
- - **Output Dimensionality:** 768 tokens
- - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->
-
- ### Model Sources
-
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-
- ### Full Model Architecture
-
- ```
- SentenceTransformer(
-   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
-   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
- )
- ```
-
- ## Usage
-
- ### Direct Usage (Sentence Transformers)
-
- First install the Sentence Transformers library:
-
- ```bash
- pip install -U sentence-transformers
- ```
-
- Then you can load this model and run inference.
- ```python
- from sentence_transformers import SentenceTransformer
-
- # Download from the 🤗 Hub
- model = SentenceTransformer("KhoaUIT/Halong-UIT-R2GQA")
- # Run inference
- sentences = [
-     'The weather is lovely today.',
-     "It's so sunny outside!",
-     'He drove to the stadium.',
- ]
- embeddings = model.encode(sentences)
- print(embeddings.shape)
- # [3, 768]
-
- # Get the similarity scores for the embeddings
- similarities = model.similarity(embeddings, embeddings)
- print(similarities.shape)
- # [3, 3]
- ```
-
- <!--
- ### Direct Usage (Transformers)
-
- <details><summary>Click to see the direct usage in Transformers</summary>
-
- </details>
- -->
-
- <!--
- ### Downstream Usage (Sentence Transformers)
-
- You can finetune this model on your own dataset.
-
- <details><summary>Click to expand</summary>
-
- </details>
- -->
-
- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Framework Versions
- - Python: 3.12.3
- - Sentence Transformers: 3.2.0
- - Transformers: 4.45.2
- - PyTorch: 2.3.0+cpu
- - Accelerate:
- - Datasets: 3.1.0
- - Tokenizers: 0.20.1
-
- ## Citation
-
- ### BibTeX
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
  -->
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ base_model:
+ - hiieu/halong_embedding
+ ---
+
+ # SentenceTransformer
+
+ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 tokens
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("KhoaUIT/Halong-UIT-R2GQA")
+ # Run inference
+ sentences = [
+     'The weather is lovely today.',
+     "It's so sunny outside!",
+     'He drove to the stadium.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Framework Versions
+ - Python: 3.12.3
+ - Sentence Transformers: 3.2.0
+ - Transformers: 4.45.2
+ - PyTorch: 2.3.0+cpu
+ - Accelerate:
+ - Datasets: 3.1.0
+ - Tokenizers: 0.20.1
+
+ ## Citation
+
+ ### BibTeX
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
  -->
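
Note on the README content in this diff: the model card lists a `Pooling` layer with `pooling_mode_mean_tokens: True` and cosine similarity as the similarity function. As a rough illustration of what those two steps compute, here is a minimal sketch in plain Python. It assumes toy 2-dimensional token vectors instead of real model output, and the helper names `mean_pool`, `cosine_similarity`, and `cosine_similarity_matrix` are hypothetical, not part of the sentence-transformers API.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarities, one row per embedding."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]


# Toy "token embeddings" (assumed values); the third token of the
# first sentence is padding and is excluded by its mask entry of 0.
sentence_a = mean_pool([[1.0, 0.0], [3.0, 0.0], [9.0, 9.0]], [1, 1, 0])
sentence_b = mean_pool([[0.0, 2.0], [0.0, 4.0]], [1, 1])
similarities = cosine_similarity_matrix([sentence_a, sentence_b])
```

With real output from the model, each sentence embedding would have 768 components rather than 2, and the similarity matrix for three sentences would be 3 × 3, matching the shapes printed in the README's usage example.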