Update README.md
README.md
CHANGED
@@ -8,20 +8,20 @@ tags:
 language:
 - de
 ---
-# {
-Base-Model
+# Overview
+**Base-Model:** gbert-base
 
-Fine-Tuning
+**Fine-Tuning:** sentence-transformer
 
-Training data
+**Training data:** German STS dataset (available [here](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark))
 * both AWS and DeepL machine translations are used
 * Training on sts-train, sts-dev
 
-Evaluation data
+**Evaluation data:** German STS dataset (sts-test)
 
-Infrastructure
+**Infrastructure:** GPU V100 (20GB)
 
-Hyperparameter
+**Hyperparameter:**
 * batch size 64
 * epochs 4
 * MultipleNegativesRankingLoss
@@ -72,8 +72,8 @@ def mean_pooling(model_output, attention_mask):
 sentences = ['This is an example sentence', 'Each sentence is converted']
 
 # Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained('{
-model = AutoModel.from_pretrained('{
+tokenizer = AutoTokenizer.from_pretrained('JoBeer/german-semantic-base')
+model = AutoModel.from_pretrained('JoBeer/german-semantic-base')
 
 # Tokenize sentences
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
@@ -95,7 +95,7 @@ print(sentence_embeddings)
 
 <!--- Describe how your model was evaluated -->
 
-For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={
+For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=JoBeer/german-semantic-base)
 
 
 ## Training
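
Note on the hyperparameters: the loss in the list presumably refers to sentence-transformers' `MultipleNegativesRankingLoss`, which treats every other positive in a batch as a negative for a given anchor, so the batch size (64 here) directly sets how many negatives each example sees. Below is a minimal pure-Python sketch of that idea, not the library's implementation. The function names `cosine` and `in_batch_ranking_loss` are hypothetical, and the scale of 20 is assumed from the library's default.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch serves as a negative.

    The scale value is an assumption, not taken from the README.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        # Scaled similarity of this anchor to every candidate in the batch.
        scores = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        # Negative log-probability of the matching positive.
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print(in_batch_ranking_loss(anchors, aligned))  # small: pairs match
    print(in_batch_ranking_loss(anchors, swapped))  # large: pairs mismatched
```

With a larger batch, each anchor is ranked against more candidates, which is one reason this loss is usually paired with comparatively large batch sizes.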