---
library_name: scikit-learn
tags:
  - jobs
  - classification
  - tf-idf
pipeline_tag: text-classification
---

# Jobs job-category classifier (sklearn)

This repo holds the trained artifact consumed by **jobs-shared**
(`jobs_shared.ai.categorizers.pipeline`) for the findjobs taxonomy.

- **Weights:** `category.joblib`, a `joblib`-serialized dict that holds the
  scikit-learn `Pipeline` (`TfidfVectorizer` + `LogisticRegression`) under
  `model`, plus the artifact keys `fields` and `input_joiner`. Compressed
  with `joblib.dump(..., compress=3)`.
- **Upstream:** Produced by `scripts/train/train_category.py`.
- **HF repo:** `gateswang00/job_classifier`

### Load locally

```python
import joblib
from huggingface_hub import hf_hub_download

# Download the artifact from the Hub (cached locally after the first call).
path = hf_hub_download(repo_id="gateswang00/job_classifier", filename="category.joblib")
artifact = joblib.load(path)
clf = artifact["model"]  # sklearn Pipeline
fields = artifact.get("fields", ["title", "llm_skills", "description"])
print(fields)
```
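
The artifact also carries `fields` and `input_joiner`, which suggests that each job record is flattened into one string before it reaches the pipeline. The following is a minimal sketch of that step, not the actual `jobs_shared` implementation: `build_input_text` and `predict_category` are hypothetical helper names, and the code assumes that `input_joiner` is a plain string separator, that `llm_skills` may arrive as a list, and that empty fields are skipped.

```python
def build_input_text(record, fields, joiner="\n"):
    """Join the configured fields of one job record into a single string.

    Assumption: missing or empty fields are skipped, and list-valued fields
    (for example a list of skills) are flattened with ", ".
    """
    parts = []
    for name in fields:
        value = record.get(name)
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        value = str(value).strip()
        if value:
            parts.append(value)
    return joiner.join(parts)


def predict_category(artifact, record):
    """Predict one category label for a job record (hypothetical helper)."""
    fields = artifact.get("fields", ["title", "llm_skills", "description"])
    joiner = artifact.get("input_joiner", "\n")
    if not isinstance(joiner, str):
        # Assumption: fall back to a newline if the stored joiner is not a string.
        joiner = "\n"
    text = build_input_text(record, fields, joiner)
    # A scikit-learn Pipeline ending in a classifier accepts a list of raw texts.
    return artifact["model"].predict([text])[0]
```

With the `artifact` loaded as shown above, `predict_category(artifact, {"title": "Data Engineer", "llm_skills": ["python", "sql"], "description": "..."})` would return one of the 14 category labels. For production use, prefer the joining logic in `jobs_shared.ai.categorizers.pipeline`, since the input text must match what the model saw during training.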

### Training metadata snapshot

```
categorizer_filter: ['qwen2.5:7b', 'qwen3-jobs-classifier']
categorizer_mix: {'qwen2.5:7b': 8086}
category_source_filter: "category_source IS DISTINCT FROM 'rules'"
category_source_mix: {'(null)': 8167}
llm_skills_coverage: 0.7056641108088053
min_per_class: 50
n_classes: 14
n_rows: 8086
random_state: 42
source: 'jobs.job_categorized JOIN jobs.jobs_found LEFT JOIN LATERAL jobs.job_extracted'
test_size: 0.2
trained_at: '2026-05-23T15:57:36.400167+00:00'
```
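
The snapshot records `min_per_class: 50` but does not say how the training script applies it. One plausible reading is that categories with fewer than 50 labeled rows are dropped before the train/test split. The sketch below illustrates that reading only: `drop_rare_classes` and the `label_key` parameter are hypothetical names, and the `"category"` column name is an assumption.

```python
from collections import Counter


def drop_rare_classes(rows, min_per_class=50, label_key="category"):
    """Keep only rows whose label occurs at least `min_per_class` times.

    Returns the kept rows and the sorted list of dropped labels.
    Assumption: each row is a dict carrying its label under `label_key`.
    """
    counts = Counter(row[label_key] for row in rows)
    kept = [row for row in rows if counts[row[label_key]] >= min_per_class]
    dropped = sorted(label for label, n in counts.items() if n < min_per_class)
    return kept, dropped
```

The kept rows could then go through a split such as scikit-learn's `train_test_split` with `test_size=0.2` and `random_state=42`, the values recorded in the snapshot.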

If needed, update this README's license and frontmatter through the Hugging Face model card UI.