Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:5000
loss:CosineSimilarityLoss
text-embeddings-inference
Instructions to use scr17/fyp with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use scr17/fyp with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("scr17/fyp")

sentences = [
    "looking Product Manager expertise AWS Cybersecurity JavaScript Cloud Architecture candidate responsible designing implementing maintaining solutions using modern technologies",
    "Emily Barry professional skilled JavaScript Machine Learning Kubernetes Computer Vision Experienced working multiple projects involving cloud technologies modern software development practices",
    "Stephen Baker professional skilled React AWS Node.js NLP Experienced working multiple projects involving cloud technologies modern software development practices",
    "James Jackson professional skilled Node.js Cybersecurity Kubernetes Docker Experienced working multiple projects involving cloud technologies modern software development practices"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
- Notebooks
- Google Colab
- Kaggle
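The snippet above yields a 4 x 4 similarity matrix. One plausible use of such scores, sketched here under assumptions, is ranking candidate profiles against a job description by cosine similarity. The vectors below are toy stand-ins for what `model.encode` would return, and `rank_candidates` is a hypothetical helper, not part of sentence-transformers.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(job_vec, candidate_vecs):
    """Return (candidate_index, score) pairs, best match first."""
    scores = [(i, cosine(job_vec, vec)) for i, vec in enumerate(candidate_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional embeddings (assumption: real embeddings come from model.encode).
job = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the job vector
    [1.0, 0.0, 0.9],  # nearly parallel to the job vector
    [0.5, 0.5, 0.5],  # partial overlap
]

ranking = rank_candidates(job, candidates)
for index, score in ranking:
    print(f"candidate {index}: {score:.4f}")
```

With real embeddings, `model.similarity` already returns the full score matrix, so a helper like this is only needed when working with plain Python lists.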
Upload fine-tuned model
- README.md +92 -135
- model.safetensors +1 -1
README.md
CHANGED
|
@@ -8,124 +8,69 @@ tags:
|
|
| 8 |
- loss:CosineSimilarityLoss
|
| 9 |
base_model: sentence-transformers/all-MiniLM-L6-v2
|
| 10 |
widget:
|
| 11 |
-
- source_sentence:
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
buy.; Attorney later road drive high could new.; Public near program language..
|
| 15 |
-
Docker; Azure; Linux; JavaScript; React. SQL; DevOps; Linux; Python; TensorFlow;
|
| 16 |
-
C#
|
| 17 |
sentences:
|
| 18 |
-
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
- AI Researcher. Rather well administration police seat stand. Red produce yeah
|
| 30 |
-
site run fly purpose face.. Yourself address might expect his budget bill.; Later
|
| 31 |
-
top focus guess occur hour.; Have turn quickly help well its.; However research
|
| 32 |
-
visit.; Commercial building especially capital system each.. Python; Machine Learning;
|
| 33 |
-
Terraform; Deep Learning; Java; Cybersecurity; Linux. At exactly story letter
|
| 34 |
-
dream.; Paper experience control author like president girl education.; Education
|
| 35 |
-
fund hear side mother.; Who then more start various draw along.
|
| 36 |
-
- source_sentence: Full Stack Developer. We are looking for a Full Stack Developer
|
| 37 |
-
to join our growing team and work on exciting projects.. Threat store center scene
|
| 38 |
-
country can quite.; Campaign today degree.; Data when risk citizen common.; Current
|
| 39 |
-
few environment social about page.. Penetration Testing; Java; Node.js; Docker;
|
| 40 |
-
JavaScript. SQL; AWS; CI/CD; JavaScript; Machine Learning; C#
|
| 41 |
sentences:
|
| 42 |
-
-
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
|
| 48 |
-
-
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
| 54 |
-
- AI Researcher. Start ball civil set although. Or environmental place boy because
|
| 55 |
-
chance.. Within value ahead.; Class democratic candidate arm.; Region represent
|
| 56 |
-
great note nothing recently low.; Way live according follow walk doctor loss..
|
| 57 |
-
Azure; TensorFlow; JavaScript; Machine Learning; CI/CD; Kubernetes; Terraform.
|
| 58 |
-
Here other next over down seem yourself model.; Discover natural generation traditional
|
| 59 |
-
suddenly management.; Discuss food majority professor.
|
| 60 |
-
- source_sentence: AI Researcher. We are looking for a AI Researcher to join our growing
|
| 61 |
-
team and work on exciting projects.. Improve hard street ask anyone accept history.;
|
| 62 |
-
Heavy a through old nothing various.; Fight clearly safe available similar hot.;
|
| 63 |
-
Movie body accept society heavy six.; Note close bad detail cell.. NoSQL; Azure;
|
| 64 |
-
Terraform; Flask; Django. Deep Learning; NoSQL; Terraform; Python; CI/CD; Flask
|
| 65 |
sentences:
|
| 66 |
-
-
|
| 67 |
-
|
| 68 |
-
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
-
|
| 73 |
-
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
|
| 78 |
-
bill so technology. Thousand north particularly difficult. Check social into decade
|
| 79 |
-
thing minute ahead.. Send memory ago full director although morning.; Relationship
|
| 80 |
-
sign front actually forget personal cold name.; Near debate notice their.. SQL;
|
| 81 |
-
AWS; C#; Node.js; Cybersecurity; Machine Learning; React. Produce fly sea.; Middle
|
| 82 |
-
race risk.; Land foot often action brother dinner.; Sign administration use book
|
| 83 |
-
section memory tree.
|
| 84 |
-
- source_sentence: Machine Learning Engineer. We are looking for a Machine Learning
|
| 85 |
-
Engineer to join our growing team and work on exciting projects.. Talk serious
|
| 86 |
-
or mouth night measure.; Article ahead capital no development.; Do minute chance
|
| 87 |
-
employee.; Account impact product land never military main show.. Cybersecurity;
|
| 88 |
-
Terraform; Deep Learning; Python; Linux. Azure; Django; Docker; NoSQL; TypeScript;
|
| 89 |
-
SQL
|
| 90 |
sentences:
|
| 91 |
-
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
they not.. Reflect development forward hand.; Investment fall what guess.; Green
|
| 103 |
-
new instead language board.. Kubernetes; TypeScript; Django; TensorFlow; AWS;
|
| 104 |
-
C#; Deep Learning. Lay tax group message work statement ago.; Can try heart city.;
|
| 105 |
-
Positive social increase throw seat share standard.; Front far prepare.
|
| 106 |
-
- source_sentence: Software Engineer. We are looking for a Software Engineer to join
|
| 107 |
-
our growing team and work on exciting projects.. Suffer class note resource.;
|
| 108 |
-
Guess really character and right scientist behavior election.; Seat force cultural
|
| 109 |
-
arm while.; Single maintain from recently.; Not thing wife focus road.. CI/CD;
|
| 110 |
-
Terraform; DevOps; JavaScript; TypeScript. Docker; Java; Azure; Deep Learning;
|
| 111 |
-
AWS; Node.js
|
| 112 |
sentences:
|
| 113 |
-
-
|
| 114 |
-
|
| 115 |
-
|
| 116 |
-
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
participant sense.; Him station low happen available woman parent.; Measure recent
|
| 123 |
-
rock say city indeed allow value.
|
| 124 |
-
- Data Scientist. Standard defense clearly project.. Single always argue offer water
|
| 125 |
-
war.; Meeting certainly leader party heavy mind authority nearly.; Sister certain
|
| 126 |
-
any itself.; Paper top at area provide.. Cybersecurity; React; C#; TensorFlow;
|
| 127 |
-
Deep Learning; Penetration Testing; DevOps. Food safe wide key.; Word identify
|
| 128 |
-
cup life clear.
|
| 129 |
pipeline_tag: sentence-similarity
|
| 130 |
library_name: sentence-transformers
|
| 131 |
---
|
|
@@ -180,9 +125,9 @@ from sentence_transformers import SentenceTransformer
|
|
| 180 |
model = SentenceTransformer("sentence_transformers_model_id")
|
| 181 |
# Run inference
|
| 182 |
sentences = [
|
| 183 |
-
'
|
| 184 |
-
'
|
| 185 |
-
'
|
| 186 |
]
|
| 187 |
embeddings = model.encode(sentences)
|
| 188 |
print(embeddings.shape)
|
|
@@ -239,16 +184,16 @@ You can finetune this model on your own dataset.
|
|
| 239 |
* Size: 5,000 training samples
|
| 240 |
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
|
| 241 |
* Approximate statistics based on the first 1000 samples:
|
| 242 |
-
| | sentence_0
|
| 243 |
-
|:--------|:-----------------------------------------------------------------------------------
|
| 244 |
-
| type | string
|
| 245 |
-
| details | <ul><li>min:
|
| 246 |
* Samples:
|
| 247 |
-
| sentence_0
|
| 248 |
-
|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
| 249 |
-
| <code>
|
| 250 |
-
| <code>
|
| 251 |
-
| <code>
|
| 252 |
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
|
| 253 |
```json
|
| 254 |
{
|
|
@@ -261,7 +206,7 @@ You can finetune this model on your own dataset.
|
|
| 261 |
|
| 262 |
- `per_device_train_batch_size`: 16
|
| 263 |
- `per_device_eval_batch_size`: 16
|
| 264 |
-
- `num_train_epochs`:
|
| 265 |
- `multi_dataset_batch_sampler`: round_robin
|
| 266 |
|
| 267 |
#### All Hyperparameters
|
|
@@ -284,7 +229,7 @@ You can finetune this model on your own dataset.
|
|
| 284 |
- `adam_beta2`: 0.999
|
| 285 |
- `adam_epsilon`: 1e-08
|
| 286 |
- `max_grad_norm`: 1
|
| 287 |
-
- `num_train_epochs`:
|
| 288 |
- `max_steps`: -1
|
| 289 |
- `lr_scheduler_type`: linear
|
| 290 |
- `lr_scheduler_kwargs`: {}
|
|
@@ -385,14 +330,26 @@ You can finetune this model on your own dataset.
|
|
| 385 |
</details>
|
| 386 |
|
| 387 |
### Training Logs
|
| 388 |
-
| Epoch
|
| 389 |
-
|:------:|:----:|:-------------:|
|
| 390 |
-
| 1.5974
|
| 391 |
-
| 3.1949
|
| 392 |
-
| 4.7923
|
| 393 |
-
| 6.3898
|
| 394 |
-
| 7.9872
|
| 395 |
-
| 9.5847
|
| 396 |
|
| 397 |
|
| 398 |
### Framework Versions
|
|
|
|
| 8 |
- loss:CosineSimilarityLoss
|
| 9 |
base_model: sentence-transformers/all-MiniLM-L6-v2
|
| 10 |
widget:
|
| 11 |
+
- source_sentence: looking Product Manager expertise AWS Cybersecurity JavaScript
|
| 12 |
+
Cloud Architecture candidate responsible designing implementing maintaining solutions
|
| 13 |
+
using modern technologies
|
| 14 |
sentences:
|
| 15 |
+
- Emily Barry professional skilled JavaScript Machine Learning Kubernetes Computer
|
| 16 |
+
Vision Experienced working multiple projects involving cloud technologies modern
|
| 17 |
+
software development practices
|
| 18 |
+
- Stephen Baker professional skilled React AWS Node.js NLP Experienced working multiple
|
| 19 |
+
projects involving cloud technologies modern software development practices
|
| 20 |
+
- James Jackson professional skilled Node.js Cybersecurity Kubernetes Docker Experienced
|
| 21 |
+
working multiple projects involving cloud technologies modern software development
|
| 22 |
+
practices
|
| 23 |
+
- source_sentence: looking Software Engineer expertise AWS TensorFlow NLP Node.js
|
| 24 |
+
candidate responsible designing implementing maintaining solutions using modern
|
| 25 |
+
technologies
|
| 26 |
sentences:
|
| 27 |
+
- Jennifer Thompson professional skilled JavaScript TensorFlow Computer Vision Django
|
| 28 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 29 |
+
development practices
|
| 30 |
+
- Lisa Bell professional skilled Python TensorFlow Computer Vision Machine Learning
|
| 31 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 32 |
+
development practices
|
| 33 |
+
- Susan Rogers professional skilled Docker Cybersecurity Machine Learning Python
|
| 34 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 35 |
+
development practices
|
| 36 |
+
- source_sentence: looking DevOps Engineer expertise Cybersecurity Machine Learning
|
| 37 |
+
SQL TensorFlow candidate responsible designing implementing maintaining solutions
|
| 38 |
+
using modern technologies
|
| 39 |
sentences:
|
| 40 |
+
- Kenneth Jones professional skilled NLP Node.js Cybersecurity Cloud Architecture
|
| 41 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 42 |
+
development practices
|
| 43 |
+
- Matthew Mcintyre professional skilled NoSQL Kubernetes React Docker Experienced
|
| 44 |
+
working multiple projects involving cloud technologies modern software development
|
| 45 |
+
practices
|
| 46 |
+
- William Wilson professional skilled SQL Kubernetes CI/CD Security Analysis Experienced
|
| 47 |
+
working multiple projects involving cloud technologies modern software development
|
| 48 |
+
practices
|
| 49 |
+
- source_sentence: looking Software Engineer expertise Cybersecurity NLP SQL Django
|
| 50 |
+
candidate responsible designing implementing maintaining solutions using modern
|
| 51 |
+
technologies
|
| 52 |
sentences:
|
| 53 |
+
- Daniel Stewart professional skilled JavaScript Python Cybersecurity TensorFlow
|
| 54 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 55 |
+
development practices
|
| 56 |
+
- Kristy Massey MD professional skilled Django Security Analysis JavaScript Cybersecurity
|
| 57 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 58 |
+
development practices
|
| 59 |
+
- Melanie Sutton professional skilled Django CI/CD JavaScript SQL Experienced working
|
| 60 |
+
multiple projects involving cloud technologies modern software development practices
|
| 61 |
+
- source_sentence: looking AI Researcher expertise CI/CD Docker TensorFlow JavaScript
|
| 62 |
+
candidate responsible designing implementing maintaining solutions using modern
|
| 63 |
+
technologies
|
| 64 |
sentences:
|
| 65 |
+
- Dr. William Ramirez professional skilled NoSQL React CI/CD Cloud Architecture
|
| 66 |
+
Experienced working multiple projects involving cloud technologies modern software
|
| 67 |
+
development practices
|
| 68 |
+
- Rebecca Wiley professional skilled Python Kubernetes Node.js JavaScript Experienced
|
| 69 |
+
working multiple projects involving cloud technologies modern software development
|
| 70 |
+
practices
|
| 71 |
+
- Roberta Graham professional skilled Flask Machine Learning Node.js Docker Experienced
|
| 72 |
+
working multiple projects involving cloud technologies modern software development
|
| 73 |
+
practices
|
| 74 |
pipeline_tag: sentence-similarity
|
| 75 |
library_name: sentence-transformers
|
| 76 |
---
|
|
|
|
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
+     'looking AI Researcher expertise CI/CD Docker TensorFlow JavaScript candidate responsible designing implementing maintaining solutions using modern technologies',
+     'Roberta Graham professional skilled Flask Machine Learning Node.js Docker Experienced working multiple projects involving cloud technologies modern software development practices',
+     'Rebecca Wiley professional skilled Python Kubernetes Node.js JavaScript Experienced working multiple projects involving cloud technologies modern software development practices',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
|
|
|
|
| 184 |
* Size: 5,000 training samples
|
| 185 |
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
|
| 186 |
* Approximate statistics based on the first 1000 samples:
|
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:-----------|:-----------|:------|
+ | type | string | string | float |
+ | details | <ul><li>min: 20 tokens</li><li>mean: 24.72 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 22 tokens</li><li>mean: 26.26 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 0.4</li><li>mean: 0.71</li><li>max: 1.0</li></ul> |
* Samples:
+ | sentence_0 | sentence_1 | label |
+ |:-----------|:-----------|:------|
+ | <code>looking AI Researcher expertise CI/CD Python Computer Vision Flask candidate responsible designing implementing maintaining solutions using modern technologies</code> | <code>Deanna Gibson professional skilled Security Analysis Node.js Machine Learning Kubernetes Experienced working multiple projects involving cloud technologies modern software development practices</code> | <code>0.481</code> |
+ | <code>looking Machine Learning Engineer expertise AWS Kubernetes Python Django candidate responsible designing implementing maintaining solutions using modern technologies</code> | <code>Amanda Johnson professional skilled AWS NLP Node.js Security Analysis Experienced working multiple projects involving cloud technologies modern software development practices</code> | <code>0.982</code> |
+ | <code>looking Cybersecurity Analyst expertise JavaScript Python Node.js NoSQL candidate responsible designing implementing maintaining solutions using modern technologies</code> | <code>Alicia Patton professional skilled Node.js TensorFlow SQL NoSQL Experienced working multiple projects involving cloud technologies modern software development practices</code> | <code>0.982</code> |
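Each training row pairs two texts with a float `label`, and the card's tags name `CosineSimilarityLoss`. As the Sentence Transformers documentation describes it, that loss by default takes the mean squared error between the cosine similarity of the two embeddings and the label. The sketch below illustrates that computation in plain Python on toy 2-dimensional vectors. `cosine_similarity_mse` is a hypothetical name, and the vectors and labels are made up for illustration, not taken from the dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_mse(pairs, labels):
    """Mean squared error between cos(u, v) and the target label."""
    errors = [(cosine(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)


# Toy embedding pairs (assumption: stand-ins for encoded sentence_0 / sentence_1).
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction, cosine = 1
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, cosine = 0
]
labels = [1.0, 0.5]

loss = cosine_similarity_mse(pairs, labels)
print(loss)
```

The first pair matches its label exactly and contributes no error. The second misses its label by 0.5, so the batch loss is the mean of 0 and 0.25.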
|
| 197 |
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
|
| 198 |
```json
|
| 199 |
{
|
|
|
|
| 206 |
|
| 207 |
- `per_device_train_batch_size`: 16
|
| 208 |
- `per_device_eval_batch_size`: 16
|
| 209 |
+
- `num_train_epochs`: 30
|
| 210 |
- `multi_dataset_batch_sampler`: round_robin
|
| 211 |
|
| 212 |
#### All Hyperparameters
|
|
|
|
| 229 |
- `adam_beta2`: 0.999
|
| 230 |
- `adam_epsilon`: 1e-08
|
| 231 |
- `max_grad_norm`: 1
|
| 232 |
+
- `num_train_epochs`: 30
|
| 233 |
- `max_steps`: -1
|
| 234 |
- `lr_scheduler_type`: linear
|
| 235 |
- `lr_scheduler_kwargs`: {}
|
|
|
|
| 330 |
</details>
|
| 331 |
|
| 332 |
### Training Logs
|
+ | Epoch | Step | Training Loss |
+ |:-------:|:----:|:-------------:|
+ | 1.5974 | 500 | 0.0324 |
+ | 3.1949 | 1000 | 0.0298 |
+ | 4.7923 | 1500 | 0.028 |
+ | 6.3898 | 2000 | 0.025 |
+ | 7.9872 | 2500 | 0.0229 |
+ | 9.5847 | 3000 | 0.0198 |
+ | 11.1821 | 3500 | 0.0179 |
+ | 12.7796 | 4000 | 0.0156 |
+ | 14.3770 | 4500 | 0.014 |
+ | 15.9744 | 5000 | 0.0127 |
+ | 17.5719 | 5500 | 0.0115 |
+ | 19.1693 | 6000 | 0.0104 |
+ | 20.7668 | 6500 | 0.0098 |
+ | 22.3642 | 7000 | 0.009 |
+ | 23.9617 | 7500 | 0.0086 |
+ | 25.5591 | 8000 | 0.0082 |
+ | 27.1565 | 8500 | 0.0078 |
+ | 28.7540 | 9000 | 0.0076 |
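The Epoch and Step columns are consistent with 313 optimizer steps per epoch, which is what 5,000 samples at a per-device batch size of 16 would give. The following sketch checks that arithmetic. It assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped; none of these are stated in the excerpt shown here.

```python
import math

# Values taken from the model card.
dataset_size = 5000
per_device_train_batch_size = 16
num_train_epochs = 30

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(dataset_size / per_device_train_batch_size)
total_steps = steps_per_epoch * num_train_epochs


def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps)
for step in (500, 1000, 9000):
    print(step, epoch_at(step))
```

Under these assumptions the run ends at step 9,390, so the last logged row (step 9,000) falls a little before the end of training.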
|
| 353 |
|
| 354 |
|
| 355 |
### Framework Versions
|
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 90864192
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:10155879dc23fd7de75a79da5c7769b7f20cbfee7e011e992069584d86ab926c
|
| 3 |
size 90864192
|