ihk/ojobert

@@ -7,10 +7,51 @@ license: mit
 language:
 - en
 widget:
-- text: "You must be proficient in [MASK]."
-- text: "Would you like to join a major manufacturing [MASK]?"
 ---
 _Nesta, the UK's innovation agency, has been scraping online job adverts since 2021 and building algorithms to extract and structure information as part of the [Open Jobs Observatory](https://www.nesta.org.uk/project/open-jobs-observatory/) project._
-_Although we are unable to share the raw data openly, we aim to open source **our models, algorithms and tools** so that anyone can use them for their own research and analysis._

 language:
 - en
 widget:
+- text: Would you like to join a major [MASK] company?
+tags:
+- jobs
 ---
 _Nesta, the UK's innovation agency, has been scraping online job adverts since 2021 and building algorithms to extract and structure information as part of the [Open Jobs Observatory](https://www.nesta.org.uk/project/open-jobs-observatory/) project._
+_Although we are unable to share the raw data openly, we aim to open source **our models, algorithms and tools** so that anyone can use them for their own research and analysis._
+This model is a `distilbert-base-uncased` checkpoint fine-tuned on 100k sentences from online job postings scraped for the Open Jobs Observatory.
+🖨️ Use
+To use the model:
+```
+from transformers import pipeline
+# Load a fill-mask pipeline with the model and its tokenizer
+model = pipeline('fill-mask', model='ihk/ojobert', tokenizer='ihk/ojobert')
+```
+An example use is as follows:
+```
+text = "Would you like to join a major [MASK] company?"
+model(text, top_k=3)
+# Output:
+# [{'score': 0.1886572688817978,
+#   'token': 13859,
+#   'token_str': 'pharmaceutical',
+#   'sequence': 'would you like to join a major pharmaceutical company?'},
+#  {'score': 0.07436735928058624,
+#   'token': 5427,
+#   'token_str': 'insurance',
+#   'sequence': 'would you like to join a major insurance company?'},
+#  {'score': 0.06400047987699509,
+#   'token': 2810,
+#   'token_str': 'construction',
+#   'sequence': 'would you like to join a major construction company?'}]
+```
+⚖️ Training results
+The fine-tuning metrics are as follows:
+- eval_loss: 2.5871026515960693
+- eval_runtime: 134.4452
+- eval_samples_per_second: 14.281
+- eval_steps_per_second: 0.223
+- epoch: 3.0
+- perplexity: 13.29
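
As a sanity check on the metrics above, the reported perplexity can be reproduced from `eval_loss`. This is a minimal sketch that assumes `eval_loss` is the mean cross-entropy in nats, so that perplexity is its exponential. The evaluation-set size and step count are rough estimates derived from the reported runtime and throughput, assuming `eval_runtime` is in seconds. They are not figures stated in the model card.

```python
import math

# Values copied from the reported evaluation metrics.
eval_loss = 2.5871026515960693
eval_runtime = 134.4452            # assumed to be in seconds
eval_samples_per_second = 14.281
eval_steps_per_second = 0.223

# Perplexity is the exponential of the mean cross-entropy loss (assumed in nats).
perplexity = math.exp(eval_loss)

# Rough estimates only: throughput multiplied by runtime.
eval_samples = eval_runtime * eval_samples_per_second
eval_steps = eval_runtime * eval_steps_per_second

print(f"perplexity: {perplexity:.2f}")
print(f"approx. eval samples: {eval_samples:.0f}")
print(f"approx. eval steps: {eval_steps:.0f}")
```

The first line printed matches the reported perplexity of 13.29. The other two values only indicate the approximate scale of the evaluation run.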