Sentence Similarity
sentence-transformers
Safetensors
English
mpnet
feature-extraction
Generated from Trainer
dataset_size:5579240
loss:CachedMultipleNegativesRankingLoss
text-embeddings-inference
How to use TechWolf/JobBERT-v2 with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("TechWolf/JobBERT-v2")

sentences = [
    "Program Coordinator RN",
    "discuss the medical history of the healthcare user, evidence-based approach in general practice, apply various lifting techniques, establish daily priorities, manage time, demonstrate disciplinary expertise, tolerate sitting for long periods, think critically, provide professional care in nursing, attend meetings, represent union members, nursing science, manage a multidisciplinary team involved in patient care, implement nursing care, customer service, work under supervision in care, keep up-to-date with training subjects, evidence-based nursing care, operate lifting equipment, follow code of ethics for biomedical practices, coordinate care, provide learning support in healthcare",
    "provide written content, prepare visual data, design computer network, deliver visual presentation of data, communication, operate relational database management system, ICT communications protocols, document management, use threading techniques, search engines, computer science, analyse network bandwidth requirements, analyse network configuration and performance, develop architectural plans, conduct ICT code review, hardware architectures, computer engineering, video-games functionalities, conduct web searches, use databases, use online tools to collaborate",
    "nursing science, administer appointments, administrative tasks in a medical environment, intravenous infusion, plan nursing care, prepare intravenous packs, work with nursing staff, supervise nursing staff, clinical perfusion"
]

embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```
Update README.md
README.md
CHANGED
@@ -189,7 +189,7 @@ import torch
 import numpy as np
 from tqdm.auto import tqdm
 from sentence_transformers import SentenceTransformer
-from sentence_transformers.util import batch_to_device
+from sentence_transformers.util import batch_to_device, cos_sim
 
 # Load the model
 model = SentenceTransformer("TechWolf/JobBERT-v2")
@@ -230,11 +230,26 @@ job_titles = [
 # Get embeddings
 embeddings = encode(model, job_titles)
 
-# Calculate similarity matrix
-similarities =
+# Calculate cosine similarity matrix
+similarities = cos_sim(embeddings, embeddings)
 print(similarities)
 ```
 
+The output will be a similarity matrix where each value represents the cosine similarity between two job titles:
+
+```
+tensor([[1.0000, 0.8723, 0.4821, 0.5447],
+        [0.8723, 1.0000, 0.4822, 0.5019],
+        [0.4821, 0.4822, 1.0000, 0.4328],
+        [0.5447, 0.5019, 0.4328, 1.0000]])
+```
+
+In this example:
+- The diagonal values are 1.0000 (perfect similarity with itself)
+- 'Software Engineer' and 'Senior Software Developer' have high similarity (0.8723)
+- 'Product Manager' and 'Data Scientist' show lower similarity with other roles
+- All values range between 0 and 1, where higher values indicate greater similarity
+
 ### Example Use Cases
 
 1. **Job Title Matching**: Find similar job titles for standardization or matching
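To make the properties described in the added README text concrete, here is a minimal pure-Python sketch of how a cosine-similarity matrix is computed. It is not the implementation behind `cos_sim`: the helpers `cosine_similarity` and `cosine_similarity_matrix` and the 3-dimensional `toy_vectors` are hypothetical stand-ins, not JobBERT-v2 outputs. It illustrates why the diagonal is 1 and why the matrix is symmetric. It also shows that cosine similarity can in general fall anywhere between -1 and 1, even though every value in the README example lies between 0 and 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy vectors for illustration only (assumed values, not real embeddings).
toy_vectors = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],   # 45 degrees away from the first vector
    [0.0, 0.0, 1.0],   # orthogonal to the first vector
    [-1.0, 0.0, 0.0],  # opposite direction to the first vector
]

similarities = cosine_similarity_matrix(toy_vectors)
for row in similarities:
    print([round(value, 4) for value in row])
```

Because the first and last toy vectors point in opposite directions, their similarity is -1, while the orthogonal pair scores 0.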