Update model card: add paper link, license, and update metadata

#2
Opened by nielsr (HF Staff)

Files changed (1):
  1. README.md (+39 −77)
README.md CHANGED
@@ -1,49 +1,43 @@
  ---
  base_model: sentence-transformers/all-mpnet-base-v2
  tags:
  - sentence-transformers
- - sentence-similarity
  - feature-extraction
- pipeline_tag: sentence-similarity
- library_name: sentence-transformers
  ---

- # SentenceTransformer

- This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->

  ### Model Sources

  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-
- ### Full Model Architecture
-
- ```
- SentenceTransformer(
-   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
-   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
- )
- ```

  ## Usage

  ### Direct Usage (Sentence Transformers)

- First install the Sentence Transformers library:

  ```bash
  pip install -U sentence-transformers
@@ -55,11 +49,12 @@ from sentence_transformers import SentenceTransformer

  # Download from the 🤗 Hub
  model = SentenceTransformer("jensjorisdecorte/ConTeXT-Skill-Extraction-base")
  # Run inference
  sentences = [
-     'The weather is lovely today.',
-     "It's so sunny outside!",
-     'He drove to the stadium.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -71,41 +66,14 @@ print(similarities.shape)
  # [3, 3]
  ```

- <!--
- ### Direct Usage (Transformers)
-
- <details><summary>Click to see the direct usage in Transformers</summary>
-
- </details>
- -->
-
- <!--
- ### Downstream Usage (Sentence Transformers)
-
- You can finetune this model on your own dataset.
-
- <details><summary>Click to expand</summary>
-
- </details>
- -->

- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->

  ## Training Details

@@ -120,22 +88,16 @@ You can finetune this model on your own dataset.

  ## Citation

- ### BibTeX
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
README.md (result):

  ---
  base_model: sentence-transformers/all-mpnet-base-v2
+ library_name: sentence-transformers
+ pipeline_tag: text-retrieval
+ license: apache-2.0
  tags:
  - sentence-transformers
+ - text-retrieval
  - feature-extraction
+ - work-domain
+ - skill-extraction
  ---

+ # ConTeXT-Skill-Extraction-base
+
+ This is a [sentence-transformers](https://www.SBERT.net) model based on the `all-mpnet-base-v2` architecture. It is designed for work-domain AI tasks, specifically skill extraction and normalization, as part of the **WorkRB** (Work Research Benchmark) framework.

+ The model is presented in the paper [WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain](https://huggingface.co/papers/2604.13055).

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
+ - **License:** Apache 2.0

  ### Model Sources

+ - **Paper:** [WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain](https://huggingface.co/papers/2604.13055)
+ - **Repository:** [WorkRB on GitHub](https://github.com/techwolf-ai/WorkRB)
  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)

  ## Usage

  ### Direct Usage (Sentence Transformers)

+ First, install the Sentence Transformers library:

  ```bash
  pip install -U sentence-transformers
 

  # Download from the 🤗 Hub
  model = SentenceTransformer("jensjorisdecorte/ConTeXT-Skill-Extraction-base")
+
  # Run inference
  sentences = [
+     'Proficient in Python programming and machine learning.',
+     'Experienced in project management and agile methodologies.',
+     'Knowledge of cloud computing and AWS infrastructure.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)

  # [3, 3]
  ```
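
The card lists Cosine Similarity as the model's similarity function, applied pairwise over the encoded sentences to give the `[3, 3]` similarity matrix above. As a minimal, model-free sketch of that computation (toy 2-d vectors stand in for the real 768-d embeddings; no download required):

```python
import numpy as np

# Cosine similarity, as used for the card's "Similarity Function".
# Toy 2-d vectors stand in for the model's 768-d sentence embeddings.
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
print(round(cosine_similarity(a, b), 4))  # 0.7071
```

With real embeddings, the same formula applied to every pair of the three encoded sentences yields the 3×3 matrix whose shape the usage example prints.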

+ ## Full Model Architecture

+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
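
The Pooling module above has `pooling_mode_mean_tokens: True`, i.e. the sentence embedding is the mean of the transformer's token embeddings over non-padding positions. A minimal numpy sketch of that step, using toy dimensions (3 tokens × 2 dims; the real model uses 768):

```python
import numpy as np

# Masked mean pooling, as configured in the Pooling module above.
# Toy 3-token, 2-d example; the real model pools 768-d token embeddings.
token_embeddings = np.array([[1.0, 2.0],
                             [3.0, 4.0],
                             [9.0, 9.0]])    # last row is a padding token
attention_mask = np.array([1.0, 1.0, 0.0])   # 1 = real token, 0 = padding

mask = attention_mask[:, None]
sentence_embedding = (token_embeddings * mask).sum(axis=0) / mask.sum()
print(sentence_embedding)  # [2. 3.] — padding row is excluded from the mean
```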

  ## Training Details

  ## Citation

+ If you find this model useful, please consider citing the following work:
+
+ ```bibtex
+ @misc{delange2025unifiedworkembeddings,
+       title={Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker},
+       author={Matthias De Lange and Jens-Joris Decorte and Jeroen Van Hautte},
+       year={2025},
+       eprint={2511.07969},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2511.07969},
+ }
+ ```