Fake Job Predictor

Data

  1. The training data comes from this Kaggle dataset: https://www.kaggle.com/datasets/shivamb/real-or-fake-fake-jobposting-prediction
  2. The original dataset has around 18k samples. To reduce class imbalance, the majority class (real jobs) was undersampled.
  3. The final dataset used for training has 4k samples.
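
The undersampling step can be illustrated with a minimal, standard-library sketch. It is not the code used to build this dataset: the `undersample` helper, the `fraudulent` label key, the toy rows, and the 1:1 target ratio are all assumptions made for illustration, and the card does not state the final class ratio.

```python
import random
from collections import defaultdict


def undersample(rows, label_key="fraudulent", seed=42):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (assumed 1:1 target)."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Size of the smallest class; every class is cut down to this size.
    target = min(len(group) for group in by_label.values())

    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))

    # Shuffle so the classes are not grouped together.
    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): 90 real postings (label 0) and 10 fake ones (label 1).
rows = [{"description": f"job {i}", "fraudulent": int(i >= 90)} for i in range(100)]
balanced = undersample(rows)
counts = {label: sum(r["fraudulent"] == label for r in balanced) for label in (0, 1)}
print(counts)  # {0: 10, 1: 10}
```

Undersampling discards majority-class rows, so it trades dataset size for balance; a fixed seed keeps the selection reproducible.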

Model

  1. Multi-head neural network. One head is used for each feature (description, requirements, and benefits of the job).
  2. Best metrics achieved (on the validation split): Precision: 0.83, Recall: 0.65, F1-score: 0.71
  3. Code used for training comes from this GitHub repo: https://github.com/sebassaras02/AdvancedDLCourse/blob/master/02_transformers_nlp/bert.ipynb
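
For readers unfamiliar with the metrics above, here is a minimal sketch of how precision, recall, and F1 are computed for a binary classifier. The `precision_recall_f1` helper and the toy labels are hypothetical. The card does not say how the scores were averaged across classes, so this example is not expected to reproduce the reported values.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels (hypothetical): 1 = fake job, 0 = real job.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.75 recall=0.75 f1=0.75
```

A precision higher than recall, as reported above, means the model raises few false alarms on real postings but misses a share of the fake ones.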

Components:

Text Encoder: distilbert-base-uncased is used to encode the textual input into a dense vector.

Future work:

Train on larger datasets and with more compute resources.

