mattohan committed on
Commit
96c6093
·
verified ·
1 Parent(s): ed703c3

Update README with accurate model details

Files changed (1)
  1. README.md +38 -28
README.md CHANGED
@@ -1,11 +1,12 @@
1
  ---
2
- library_name: onnx
3
  tags:
4
  - email-classification
5
  - job-search
6
  - onnx
7
  - transformers.js
8
  - browser-ml
 
9
  license: mit
10
  datasets:
11
  - custom
@@ -16,52 +17,61 @@ pipeline_tag: text-classification
16
 
17
  # Email Classifier for Job Applications
18
 
19
- A fine-tuned BGE-small model that classifies emails into job application categories. Designed to run entirely in the browser using ONNX Runtime Web.
20
 
21
  ## Model Description
22
 
23
- - **Base Model:** BAAI/bge-small-en-v1.5
24
  - **Task:** 5-class email classification
25
- - **Format:** ONNX (optimized for browser inference)
26
- - **Size:** ~128MB
27
 
28
  ## Labels
29
 
30
- | Label | Description | Application Status |
31
- |-------|-------------|-------------------|
32
- | `confirmation` | Application received/confirmed | Applied |
33
- | `rejection` | Application rejected | Rejected |
34
- | `interview` | Interview invitation | Interviewing |
35
- | `offer` | Job offer | Offer |
36
- | `not_job` | Not job-related | - |
37
 
38
- ## Performance
39
 
40
- - **Validation Accuracy:** 99.65%
41
- - **Training Data:** 28,500 synthetic + curated emails
 
 
 
 
42
 
43
- ## Usage with ONNX Runtime Web
44
 
45
  ```javascript
46
- import * as ort from 'onnxruntime-web';
47
 
48
- // Load model
49
- const session = await ort.InferenceSession.create(
50
- 'https://huggingface.co/YOUR_USERNAME/email-classifier/resolve/main/model.onnx'
 
51
  );
52
 
53
- // Tokenize and run inference
54
- const results = await session.run({
55
- input_ids: inputIdsTensor,
56
- attention_mask: attentionMaskTensor,
57
- });
58
  ```
59
 
60
  ## Files
61
 
62
- - `model.onnx` - The ONNX model file
63
- - `vocab.txt` - Vocabulary file for tokenization
64
- - `config.json` - Model configuration
 
 
 
 
 
 
 
65
 
66
  ## Privacy
67
 
 
1
  ---
2
+ library_name: transformers.js
3
  tags:
4
  - email-classification
5
  - job-search
6
  - onnx
7
  - transformers.js
8
  - browser-ml
9
+ - bert
10
  license: mit
11
  datasets:
12
  - custom
 
17
 
18
  # Email Classifier for Job Applications
19
 
20
+ A fine-tuned BGE-small model that classifies emails into job application categories. Designed to run entirely in the browser using Transformers.js.
21
 
22
  ## Model Description
23
 
24
+ - **Base Model:** BAAI/bge-small-en-v1.5 (33M parameters)
25
  - **Task:** 5-class email classification
26
+ - **Format:** ONNX (opset 14, IR version 7)
27
+ - **Size:** 32.5 MB (quantized) / 127.6 MB (full)
28
 
29
  ## Labels
30
 
31
+ | Label | Description |
32
+ |-------|-------------|
33
+ | `confirmation` | Application received/confirmed |
34
+ | `rejection` | Application rejected |
35
+ | `interview` | Interview invitation |
36
+ | `offer` | Job offer |
37
+ | `not_job` | Not job-related |
38
 
39
+ ## Training
40
 
41
+ - **Method:** Curriculum learning (2-2-1 epochs)
42
+ - Phase 1: 2 epochs on full-body emails
43
+ - Phase 2: 2 epochs with 4:1 full-body:snippet mix
44
+ - Phase 3: 1 epoch with 1:1 balanced mix
45
+ - **Training Data:** ~28K emails (original + augmented snippets)
46
+ - **Validation Accuracy:** 100% (full-body), 100% (snippet)
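
The 2-2-1 schedule listed above can be written out as a per-epoch plan. This is only a minimal sketch: the `PHASES` table and the `buildEpochSchedule` helper are hypothetical names, the snippet fraction is an illustrative way to express each mix ratio, and treating Phase 1 as containing no snippets is an assumption. It is not the training code used for this model.

```javascript
// Hypothetical sketch of the 2-2-1 curriculum described in the README.
// Each phase records its epoch count and its full-body:snippet ratio.
const PHASES = [
  { name: 'phase1', epochs: 2, fullBody: 1, snippet: 0 }, // full-body only (assumed: no snippets)
  { name: 'phase2', epochs: 2, fullBody: 4, snippet: 1 }, // 4:1 full-body:snippet mix
  { name: 'phase3', epochs: 1, fullBody: 1, snippet: 1 }, // 1:1 balanced mix
];

// Expand the phases into one entry per epoch, with the share of snippets
// that a data sampler would draw during that epoch.
function buildEpochSchedule(phases) {
  const schedule = [];
  for (const phase of phases) {
    const snippetFraction = phase.snippet / (phase.fullBody + phase.snippet);
    for (let i = 0; i < phase.epochs; i++) {
      schedule.push({ phase: phase.name, snippetFraction });
    }
  }
  return schedule;
}

const schedule = buildEpochSchedule(PHASES);
console.log(schedule.length); // 5
```

Under these assumptions the plan covers five epochs, with snippet fractions of 0, 0, 0.2, 0.2 and 0.5.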
47
 
48
+ ## Usage with Transformers.js
49
 
50
  ```javascript
51
+ import { pipeline } from '@xenova/transformers';
52
 
53
+ const classifier = await pipeline(
54
+ 'text-classification',
55
+ 'mattohan/job-tracker-email-classifier',
56
+ { quantized: true }
57
  );
58
 
59
+ const result = await classifier('Thank you for applying to the Software Engineer position...');
60
+ // [{ label: 'confirmation', score: 0.99 }]
 
 
 
61
  ```
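
A possible follow-up step is turning the classifier output into an application status. The sketch below assumes the `[{ label, score }]` output shape shown in the usage example and reuses the status names from the "Application Status" column that this commit removes. The `toApplicationStatus` helper and the `minScore` threshold are hypothetical and are not part of the model or of Transformers.js.

```javascript
// Status names taken from the previous README table; `not_job` has no status.
const STATUS_BY_LABEL = {
  confirmation: 'Applied',
  rejection: 'Rejected',
  interview: 'Interviewing',
  offer: 'Offer',
  not_job: null,
};

// Pick the highest-scoring prediction and map it to a status.
// minScore is an illustrative cutoff, not a value from the model card.
function toApplicationStatus(predictions, minScore = 0.8) {
  if (!Array.isArray(predictions) || predictions.length === 0) return null;
  const top = predictions.reduce((best, p) => (p.score > best.score ? p : best));
  if (top.score < minScore) return null; // too uncertain to update anything
  return STATUS_BY_LABEL[top.label] ?? null; // unknown labels also map to null
}

console.log(toApplicationStatus([{ label: 'confirmation', score: 0.99 }])); // Applied
```

Returning `null` for low-confidence or non-job emails lets the caller leave an existing status untouched, which may or may not match how the application handles these cases.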
62
 
63
  ## Files
64
 
65
+ ```
66
+ ├── config.json
67
+ ├── tokenizer.json
68
+ ├── tokenizer_config.json
69
+ ├── vocab.txt
70
+ ├── special_tokens_map.json
71
+ └── onnx/
72
+     ├── model.onnx # Full model (127.6 MB)
73
+     └── model_quantized.onnx # Quantized model (32.5 MB)
74
+ ```
75
 
76
  ## Privacy
77