Release GreenNode/GreenNode-Embedding-Large-VN-Mixed-V1
README.md
CHANGED
|
@@ -1,9 +1,9 @@
|
|
| 1 |
-
|
| 2 |
---
|
| 3 |
datasets:
|
| 4 |
- GreenNode/GreenNode-Table-Markdown-Retrieval
|
| 5 |
language:
|
| 6 |
- vi
|
|
|
|
| 7 |
library_name: sentence-transformers
|
| 8 |
pipeline_tag: sentence-similarity
|
| 9 |
tags:
|
|
@@ -13,7 +13,7 @@ tags:
|
|
| 13 |
widget: []
|
| 14 |
metrics:
|
| 15 |
- InfoNCE
|
| 16 |
-
license:
|
| 17 |
---
|
| 18 |
|
| 19 |
# SentenceTransformer
|
|
@@ -24,13 +24,11 @@ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps
|
|
| 24 |
|
| 25 |
### Model Description
|
| 26 |
- **Model Type:** Sentence Transformer
|
| 27 |
-
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
|
| 28 |
- **Maximum Sequence Length:** 8192 tokens
|
| 29 |
- **Output Dimensionality:** 1024 tokens
|
| 30 |
- **Similarity Function:** Cosine Similarity
|
| 31 |
-
- **Training Dataset:** - GreenNode/GreenNode-Table-Markdown-Retrieval
|
| 32 |
- **Language:** Vietnamese
|
| 33 |
-
- **License:** cc-by-4.0
|
| 34 |
|
| 35 |
### Model Sources
|
| 36 |
|
|
@@ -63,7 +61,7 @@ Then you can load this model and run inference.
|
|
| 63 |
from sentence_transformers import SentenceTransformer
|
| 64 |
|
| 65 |
# Download from the 🤗 Hub
|
| 66 |
-
model = SentenceTransformer("
|
| 67 |
# Run inference
|
| 68 |
sentences = [
|
| 69 |
'The weather is lovely today.',
|
|
@@ -80,43 +78,6 @@ print(similarities.shape)
|
|
| 80 |
# [3, 3]
|
| 81 |
```
|
| 82 |
|
| 83 |
-
<!--
|
| 84 |
-
### Direct Usage (Transformers)
|
| 85 |
-
|
| 86 |
-
<details><summary>Click to see the direct usage in Transformers</summary>
|
| 87 |
-
|
| 88 |
-
</details>
|
| 89 |
-
-->
|
| 90 |
-
|
| 91 |
-
<!--
|
| 92 |
-
### Downstream Usage (Sentence Transformers)
|
| 93 |
-
|
| 94 |
-
You can finetune this model on your own dataset.
|
| 95 |
-
|
| 96 |
-
<details><summary>Click to expand</summary>
|
| 97 |
-
|
| 98 |
-
</details>
|
| 99 |
-
-->
|
| 100 |
-
|
| 101 |
-
<!--
|
| 102 |
-
### Out-of-Scope Use
|
| 103 |
-
|
| 104 |
-
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
|
| 105 |
-
-->
|
| 106 |
-
|
| 107 |
-
<!--
|
| 108 |
-
## Bias, Risks and Limitations
|
| 109 |
-
|
| 110 |
-
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
|
| 111 |
-
-->
|
| 112 |
-
|
| 113 |
-
<!--
|
| 114 |
-
### Recommendations
|
| 115 |
-
|
| 116 |
-
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
|
| 117 |
-
-->
|
| 118 |
-
|
| 119 |
-
## Training Details
|
| 120 |
## Evaluation
|
| 121 |
### Table: Performance comparison of various models on GreenNodeTableRetrieval
|
| 122 |
Dataset: [GreenNode/GreenNode-Table-Markdown-Retrieval](https://huggingface.co/datasets/GreenNode/GreenNode-Table-Markdown-Retrieval-VN)
|
|
@@ -197,24 +158,10 @@ Dataset: [taidng/UIT-ViQuAD2.0](https://huggingface.co/datasets/taidng/UIT-ViQuA
|
|
| 197 |
- Datasets: 2.20.0
|
| 198 |
- Tokenizers: 0.19.1
|
| 199 |
|
| 200 |
-
##
|
| 201 |
-
|
| 202 |
-
### BibTeX
|
| 203 |
|
| 204 |
-
|
| 205 |
-
## Glossary
|
| 206 |
|
| 207 |
-
|
| 208 |
-
-->
|
| 209 |
-
|
| 210 |
-
<!--
|
| 211 |
-
## Model Card Authors
|
| 212 |
-
|
| 213 |
-
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
|
| 214 |
-
-->
|
| 215 |
-
|
| 216 |
-
<!--
|
| 217 |
-
## Model Card Contact
|
| 218 |
|
| 219 |
-
|
| 220 |
-
-->
|
|
|
|
|
|
|
| 1 |
---
|
| 2 |
datasets:
|
| 3 |
- GreenNode/GreenNode-Table-Markdown-Retrieval
|
| 4 |
language:
|
| 5 |
- vi
|
| 6 |
+
- en
|
| 7 |
library_name: sentence-transformers
|
| 8 |
pipeline_tag: sentence-similarity
|
| 9 |
tags:
|
|
|
|
| 13 |
widget: []
|
| 14 |
metrics:
|
| 15 |
- InfoNCE
|
| 16 |
+
license: mit
|
| 17 |
---
|
| 18 |
|
| 19 |
# SentenceTransformer
|
|
|
|
| 24 |
|
| 25 |
### Model Description
|
| 26 |
- **Model Type:** Sentence Transformer
|
|
|
|
| 27 |
- **Maximum Sequence Length:** 8192 tokens
|
| 28 |
- **Output Dimensionality:** 1024 tokens
|
| 29 |
- **Similarity Function:** Cosine Similarity
|
| 30 |
+
- **Training Dataset:** - [GreenNode/GreenNode-Table-Markdown-Retrieval](https://huggingface.co/datasets/GreenNode/GreenNode-Table-Markdown-Retrieval-VN)
|
| 31 |
- **Language:** Vietnamese
|
|
|
|
| 32 |
|
| 33 |
### Model Sources
|
| 34 |
|
|
|
|
| 61 |
from sentence_transformers import SentenceTransformer
|
| 62 |
|
| 63 |
# Download from the 🤗 Hub
|
| 64 |
+
model = SentenceTransformer("GreenNode/GreenNode-Embedding-Large-VN-Mixed-V1")
|
| 65 |
# Run inference
|
| 66 |
sentences = [
|
| 67 |
'The weather is lovely today.',
|
|
|
|
| 78 |
# [3, 3]
|
| 79 |
```
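The snippet in this hunk ends by printing a `[3, 3]` similarity shape, and the model card lists cosine similarity as the similarity function. As a rough illustration of what such a matrix contains, here is a minimal sketch in plain Python. The 4-dimensional toy vectors and the helper names `cosine` and `similarity_matrix` are assumptions made for illustration; they stand in for the 1024-dimensional embeddings the model would return and are not part of the README.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities: one row and one column per sentence."""
    return [[cosine(u, v) for v in embeddings] for u in embeddings]


# Hypothetical toy embeddings, one per sentence (not real model output).
toy_embeddings = [
    [0.9, 0.1, 0.0, 0.2],
    [0.8, 0.2, 0.1, 0.3],
    [0.0, 0.9, 0.7, 0.1],
]

similarities = similarity_matrix(toy_embeddings)
print(len(similarities), len(similarities[0]))  # rows, columns
```

Each diagonal entry is 1 (a vector compared with itself) and the matrix is symmetric. With the real model, recent sentence-transformers releases expose `model.encode(...)` and `model.similarity(...)`, which should yield the same kind of matrix over the actual embeddings.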
|
| 80 |
|
| 81 |
## Evaluation
|
| 82 |
### Table: Performance comparison of various models on GreenNodeTableRetrieval
|
| 83 |
Dataset: [GreenNode/GreenNode-Table-Markdown-Retrieval](https://huggingface.co/datasets/GreenNode/GreenNode-Table-Markdown-Retrieval-VN)
|
|
|
|
| 158 |
- Datasets: 2.20.0
|
| 159 |
- Tokenizers: 0.19.1
|
| 160 |
|
| 161 |
+
## License
|
| 162 |
|
| 163 |
+
This repository and the model weights are licensed under the [MIT License](LICENSE).
|
|
|
|
| 164 |
|
| 165 |
+
## Citation
|
| 166 |
|
| 167 |
+
### BibTeX
|
|
|