gateswang00 committed
Commit b96f51a · verified · 1 Parent(s): 1f7ddde

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +50 -1
  2. category.pkl +3 -0
README.md CHANGED
@@ -1,3 +1,52 @@
  ---
- license: mit
+ library_name: scikit-learn
+ tags:
+ - jobs
+ - classification
+ - tf-idf
+ pipeline_tag: text-classification
  ---
+
+ # Jobs job-category classifier (sklearn)
+
+ This repo holds the trained artifact used by the **jobs-ai-shared** package
+ (`jobs_ai_shared.library.categorize`) for the findjobs taxonomy.
+
+ - **Weights:** `category.pkl`: a pickled dict whose `model` key holds a scikit-learn `Pipeline`
+   (`TfidfVectorizer` + `LogisticRegression`), alongside the artifact keys `fields` and `input_joiner`.
+ - **Upstream:** Produced by `scripts/train/train_category.py`.
+ - **HF repo:** `gateswang00/job_classifier`
+
+ ### Load locally
+
+ ```python
+ import pickle
+
+ from huggingface_hub import hf_hub_download
+
+ # Download category.pkl from the Hub (cached locally after the first call).
+ path = hf_hub_download(repo_id="gateswang00/job_classifier", filename="category.pkl")
+ # Only unpickle files from a source you trust: pickle.load can run arbitrary code.
+ with open(path, "rb") as f:
+     artifact = pickle.load(f)
+ clf = artifact["model"]  # sklearn Pipeline
+ fields = artifact.get("fields", ["title", "llm_skills", "description"])
+ print(fields)
+ ```
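Once the artifact is loaded, a prediction needs input text in the same shape the pipeline saw during training. The sketch below assumes, without confirmation from this repo, that each input was built by joining the values of `fields` in order with `input_joiner`. The helper names `build_text` and `categorize`, the `"\n"` fallback joiner, the `StubModel` stand-in, and its made-up labels are all hypothetical; they exist only so the example runs without downloading the pickle.

```python
from typing import Any, Mapping, Sequence

# Same fallback field list as the loading snippet.
DEFAULT_FIELDS = ["title", "llm_skills", "description"]


def build_text(record: Mapping[str, Any], fields: Sequence[str], joiner: str) -> str:
    """Join the selected record fields into one string (assumed training format)."""
    parts = []
    for name in fields:
        value = record.get(name)
        if value is None:
            continue  # skip missing fields; the real training script may differ
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        parts.append(str(value))
    return joiner.join(parts)


def categorize(artifact: Mapping[str, Any], record: Mapping[str, Any]) -> str:
    """Predict one label using the artifact's model, fields, and joiner."""
    fields = artifact.get("fields", DEFAULT_FIELDS)
    joiner = artifact.get("input_joiner", "\n")  # hypothetical fallback value
    text = build_text(record, fields, joiner)
    return str(artifact["model"].predict([text])[0])


class StubModel:
    """Stand-in exposing predict() like an sklearn Pipeline; not the real model."""

    def predict(self, texts):
        # Made-up labels, for illustration only.
        return ["engineering" if "python" in t.lower() else "other" for t in texts]


demo_artifact = {"model": StubModel(), "fields": DEFAULT_FIELDS, "input_joiner": " | "}
demo_record = {"title": "Backend Engineer", "llm_skills": ["Python", "SQL"], "description": None}

demo_text = build_text(demo_record, DEFAULT_FIELDS, " | ")
demo_label = categorize(demo_artifact, demo_record)
print(demo_text)
print(demo_label)
```

With the real artifact, passing the unpickled dict to `categorize` in place of `demo_artifact` should work the same way, provided the joining assumption matches what the training script does.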
+
+ ### Training metadata snapshot
+
+ ```
+ categorizer_filter: ['qwen2.5:7b', 'qwen3-jobs-classifier']
+ categorizer_mix: {'qwen2.5:7b': 8086}
+ category_source_filter: "category_source IS DISTINCT FROM 'rules'"
+ category_source_mix: {'(null)': 8167}
+ llm_skills_coverage: 0.7056641108088053
+ min_per_class: 50
+ n_classes: 14
+ n_rows: 8086
+ random_state: 42
+ source: 'jobs.job_categorized JOIN jobs.jobs_found LEFT JOIN LATERAL jobs.job_extracted'
+ test_size: 0.2
+ trained_at: '2026-05-23T05:57:02.189546+00:00'
+ ```
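A few of these numbers can be cross-checked with simple arithmetic. The sketch below assumes that the hold-out split follows the rounding used by scikit-learn's `train_test_split` (test rows rounded up) and that `llm_skills_coverage` is a fraction of `n_rows`. Neither is stated in the metadata, so the derived counts are estimates rather than recorded values.

```python
import math

# Values copied from the metadata snapshot.
n_rows = 8086
test_size = 0.2
llm_skills_coverage = 0.7056641108088053

# Assumed split rule: the test set gets ceil(test_size * n_rows) rows.
n_test = math.ceil(test_size * n_rows)
n_train = n_rows - n_test

# Assumed meaning: share of the n_rows training rows with llm_skills present.
rows_with_skills = round(llm_skills_coverage * n_rows)

print(n_train, n_test, rows_with_skills)
```

Under those assumptions the split comes to 6468 training rows and 1618 test rows, and the coverage value corresponds to 5706 rows with `llm_skills` present.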
+
+ Replace this README’s license/frontmatter via the Hugging Face model card UI if needed.
category.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e493e3c74f0a924e0011057c095eaa482e4f80265c63108603dcc90b9d5cf9c
+ size 9371220