| # Sentence Transformers | |
| This task lets you easily train or fine-tune a Sentence Transformer model on your own dataset. | |
| AutoTrain supports the following types of Sentence Transformer fine-tuning: | |
| - `pair`: a dataset with two sentences: anchor and positive | |
| - `pair_class`: a dataset with two sentences (premise and hypothesis) and a target label | |
| - `pair_score`: a dataset with two sentences (sentence1 and sentence2) and a target score | |
| - `triplet`: a dataset with three sentences: anchor, positive and negative | |
| - `qa`: a dataset with two sentences: query and answer | |
| ## Data Format | |
| Sentence Transformers fine-tuning accepts data in CSV or JSONL format. You can also use a dataset from the Hugging Face Hub. | |
| ### `pair` | |
| For `pair` training, the data should be in the following format: | |
| | anchor | positive | | |
| |--------|----------| | |
| | hello | hi | | |
| | how are you | I am fine | | |
| | What is your name? | My name is Abhishek | | |
| | Which is the best programming language? | Python | | |
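The table above maps directly onto either accepted file format. The following is a minimal sketch, using only the Python standard library, of how the same `pair` rows could be serialized as CSV and as JSONL. The helper names `to_csv` and `to_jsonl` are hypothetical and are not part of AutoTrain.

```python
import csv
import io
import json

# Example rows taken from the `pair` table above.
rows = [
    {"anchor": "hello", "positive": "hi"},
    {"anchor": "how are you", "positive": "I am fine"},
]


def to_csv(rows):
    """Serialize rows as CSV text: a header line, then one line per row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


csv_text = to_csv(rows)
jsonl_text = to_jsonl(rows)
print(csv_text)
print(jsonl_text)
```

In CSV the column names appear once, in the header line. In JSONL every line repeats them as object keys, so each line can be parsed on its own.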
| ### `pair_class` | |
| For `pair_class` training, the data should be in the following format: | |
| | premise | hypothesis | label | | |
| |---------|------------|-------| | |
| | hello | hi | 1 | | |
| | how are you | I am fine | 0 | | |
| | What is your name? | My name is Abhishek | 1 | | |
| | Which is the best programming language? | Python | 1 | | |
| ### `pair_score` | |
| For `pair_score` training, the data should be in the following format: | |
| | sentence1 | sentence2 | score | | |
| |-----------|-----------|-------| | |
| | hello | hi | 0.8 | | |
| | how are you | I am fine | 0.2 | | |
| | What is your name? | My name is Abhishek | 0.9 | | |
| | Which is the best programming language? | Python | 0.7 | | |
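Because the `score` column holds numbers rather than text, it can help to check a file locally before training. The sketch below assumes a CSV that uses the column names shown in the table above. `load_pair_score` is a hypothetical helper, not an AutoTrain function, and it only checks that each score parses as a float.

```python
import csv
import io


def load_pair_score(csv_text):
    """Parse pair_score CSV text into (sentence1, sentence2, score) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    required = {"sentence1", "sentence2", "score"}
    missing = required - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    examples = []
    # Line 1 is the header, so data starts at line 2
    # (assuming no field spans multiple lines).
    for line_no, row in enumerate(reader, start=2):
        try:
            score = float(row["score"])
        except (TypeError, ValueError):
            raise ValueError(
                f"line {line_no}: score is not a number: {row['score']!r}"
            )
        examples.append((row["sentence1"], row["sentence2"], score))
    return examples


sample = "sentence1,sentence2,score\nhello,hi,0.8\nhow are you,I am fine,0.2\n"
examples = load_pair_score(sample)
print(examples)
```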
| ### `triplet` | |
| For `triplet` training, the data should be in the following format: | |
| | anchor | positive | negative | | |
| |--------|----------|----------| | |
| | hello | hi | bye | | |
| | how are you | I am fine | I am not fine | | |
| | What is your name? | My name is Abhishek | What's it to you? | | |
| | Which is the best programming language? | Python | JavaScript | | |
| ### `qa` | |
| For `qa` training, the data should be in the following format: | |
| | query | answer | | |
| |-------|--------| | |
| | hello | hi | | |
| | how are you | I am fine | | |
| | What is your name? | My name is Abhishek | | |
| | Which is the best programming language? | Python | | |
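The column layouts of the five types above can be collected into one lookup table for a quick local sanity check. This is a sketch that assumes your file uses exactly the column names shown in the tables. `EXPECTED_COLUMNS` and `missing_columns` are illustrative names, not part of AutoTrain.

```python
# Column names as they appear in the example tables above.
EXPECTED_COLUMNS = {
    "pair": ["anchor", "positive"],
    "pair_class": ["premise", "hypothesis", "label"],
    "pair_score": ["sentence1", "sentence2", "score"],
    "triplet": ["anchor", "positive", "negative"],
    "qa": ["query", "answer"],
}


def missing_columns(trainer_type, header):
    """Return the expected columns for `trainer_type` that `header` lacks."""
    expected = EXPECTED_COLUMNS[trainer_type]
    return [name for name in expected if name not in header]


# A triplet file without a `negative` column is flagged.
print(missing_columns("triplet", ["anchor", "positive"]))
# A complete qa header yields an empty list.
print(missing_columns("qa", ["query", "answer"]))
```

For a CSV file, `header` would be the first line split into column names. For JSONL, it would be the keys of a parsed line.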
| ## Parameters | |
| [[autodoc]] trainers.sent_transformers.params.SentenceTransformersParams | |