Upload folder using huggingface_hub

Files changed (6)

README.md ADDED

+---
+language: en
+tags:
+  - job-classification
+  - salary-prediction
+  - experience-prediction
+  - deberta
+---
+# JobPredictor1
+Fine-tuned DeBERTa-v3-base model that predicts:
+- **Expected years of experience** required for a job
+- **Expected salary** (USD)
+## Input Format
+```
+[LOCATION] <Remote | United States (State) | Country> [TITLE]: <job title> [DESC]: <job description>
+```
+## Outputs
+| Output | Type | Description |
+|---|---|---|
+| expected_experience_years | int | Years of experience required |
+| expected_salary | int | Expected salary (USD) |
+## Normalization
+Experience targets are z-score normalized. To recover years from a raw prediction, using the statistics in `norm_stats.json`:
+```python
+real_value = pred * norm_stats["expected_experience_years"]["std"] + norm_stats["expected_experience_years"]["mean"]
+```
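+Salary targets are log1p-transformed, then z-score normalized. To recover USD from a raw prediction, invert both steps:
+```python
+import numpy as np
+real_salary = np.expm1(pred * norm_stats["expected_salary"]["std"] + norm_stats["expected_salary"]["mean"])
+```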
+## Test Set Performance
+| Metric | Value |
+|---|---|
+| Experience MAE | 0.57 years |
+| Experience Within 1yr | 83.1% |
+| Salary MAE | $15,511 |
+| Salary Within $20k | 84.5% |
+## Base Model
+microsoft/deberta-v3-base
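
Note (not part of the commit): the README's Input Format template can be assembled with a small helper. This is a minimal sketch, and it rests on two assumptions: the angle brackets in the template are placeholders rather than literal characters, and runs of whitespace can be collapsed to single spaces. The README does not say how whitespace was handled during training, and `build_input` is a hypothetical name.

```python
def build_input(location: str, title: str, description: str) -> str:
    """Assemble one input string following the README's Input Format template."""
    # Assumption: collapse newlines and repeated spaces so each field is one line.
    loc, ttl, desc = (" ".join(s.split()) for s in (location, title, description))
    # Tag spelling and colons are copied from the template: [LOCATION] has no colon.
    return f"[LOCATION] {loc} [TITLE]: {ttl} [DESC]: {desc}"


if __name__ == "__main__":
    print(build_input("Remote", "Data Engineer", "Build pipelines.\nSQL required."))
```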

config.json ADDED

+{
+  "base_model": "microsoft/deberta-v3-base",
+  "architecture": "DeBERTa-v3-base + 2 regression heads",
+  "outputs": {
+    "expected_experience_years": "integer (years of experience)",
+    "expected_salary": "integer (expected salary USD)"
+  },
+  "normalization": {
+    "expected_experience_years": "z-score \u2014 use norm_stats.json to denormalize",
+    "expected_salary": "log1p then z-score \u2014 reverse with: expm1(pred * std + mean)"
+  },
+  "max_length": 512,
+  "dropout": 0.2
+}

norm_stats.json ADDED

+{
+  "expected_experience_years": {
+    "mean": 2.9545610445835053,
+    "std": 2.8019970286307627
+  },
+  "expected_salary": {
+    "mean": 11.046510169065424,
+    "std": 0.9250129804606931
+  }
+}
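
Note (not part of the commit): the two denormalization formulas in the README can be checked against these statistics without NumPy. The sketch below uses only the standard library. The function names are hypothetical, and `pred` is assumed to be the raw, normalized output of the corresponding regression head.

```python
import json
import math

# Values copied from the norm_stats.json diff above.
NORM_STATS = json.loads("""
{
  "expected_experience_years": {"mean": 2.9545610445835053, "std": 2.8019970286307627},
  "expected_salary": {"mean": 11.046510169065424, "std": 0.9250129804606931}
}
""")


def denormalize_experience(pred: float, stats: dict = NORM_STATS) -> float:
    """Invert the z-score: years = pred * std + mean."""
    s = stats["expected_experience_years"]
    return pred * s["std"] + s["mean"]


def denormalize_salary(pred: float, stats: dict = NORM_STATS) -> float:
    """Invert the z-score, then invert log1p with expm1 to get USD."""
    s = stats["expected_salary"]
    return math.expm1(pred * s["std"] + s["mean"])


if __name__ == "__main__":
    # A raw prediction of 0.0 maps back to the training mean in each space.
    print(denormalize_experience(0.0), denormalize_salary(0.0))
```

A raw prediction of 0.0 corresponds to the mean experience of about 2.95 years, and to the mean of the log-salary distribution, which works out to roughly $62,700.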

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:930177670bfead06a4bfaf12bf756bb27c61301ce20079c8af01905cddbf3bee
+size 736994139
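
Note (not part of the commit): the diff shows a Git LFS pointer, not the weights themselves. After downloading `pytorch_model.bin`, its size and SHA-256 can be compared with the pointer. The sketch below is one way to do that with the standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer, tolerating a leading '+' from diff output."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and SHA-256 named in the pointer."""
    if os.path.getsize(path) != int(pointer["size"]):
        return False  # cheap check first, before hashing a large file
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so a ~737 MB file is never fully loaded into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    algo, _, expected = pointer["oid"].partition(":")
    return algo == "sha256" and digest.hexdigest() == expected
```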

tokenizer.json ADDED

The diff for this file is too large to render.

tokenizer_config.json ADDED

+{
+  "add_prefix_space": true,
+  "backend": "tokenizers",
+  "bos_token": "[CLS]",
+  "cls_token": "[CLS]",
+  "do_lower_case": false,
+  "eos_token": "[SEP]",
+  "extra_special_tokens": [
+    "[PAD]",
+    "[CLS]",
+    "[SEP]"
+  ],
+  "is_local": false,
+  "mask_token": "[MASK]",
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "split_by_punct": false,
+  "tokenizer_class": "DebertaV2Tokenizer",
+  "unk_id": 3,
+  "unk_token": "[UNK]",
+  "vocab_type": "spm"
+}
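
Note (not part of the commit): the `model_max_length` value above equals `int(1e30)`, the placeholder that `transformers` writes when no limit is configured, while `config.json` sets `max_length` to 512. The sketch below shows one way a caller might resolve the limit to apply. `effective_max_length` is a hypothetical helper, and treating `config.json`'s `max_length` as the truncation length used in training is an assumption.

```python
# Same value as the "no limit configured" placeholder seen in tokenizer_config.json.
VERY_LARGE_INTEGER = int(1e30)


def effective_max_length(tokenizer_cfg: dict, model_cfg: dict) -> int:
    """Pick the tighter of the tokenizer's and the model config's length limits."""
    tok_limit = tokenizer_cfg.get("model_max_length", VERY_LARGE_INTEGER)
    model_limit = model_cfg.get("max_length", VERY_LARGE_INTEGER)
    return min(tok_limit, model_limit)


if __name__ == "__main__":
    limit = effective_max_length(
        {"model_max_length": 1000000000000000019884624838656},
        {"max_length": 512},
    )
    print(limit)
```

With these two files the result is 512, so a caller would likely need to pass `truncation=True, max_length=512` explicitly when tokenizing, since the tokenizer's own limit will not truncate anything.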