mattohan/job-tracker-email-classifier

@@ -1,11 +1,12 @@
 ---
-library_name: onnx
 tags:
   - email-classification
   - job-search
   - onnx
   - transformers.js
   - browser-ml
 license: mit
 datasets:
   - custom
@@ -16,52 +17,61 @@ pipeline_tag: text-classification
 # Email Classifier for Job Applications
-A fine-tuned BGE-small model that classifies emails into job application categories. Designed to run entirely in the browser using ONNX Runtime Web.
 ## Model Description
-- **Base Model:** BAAI/bge-small-en-v1.5
 - **Task:** 5-class email classification
-- **Format:** ONNX (optimized for browser inference)
-- **Size:** ~128MB
 ## Labels
-| Label | Description | Application Status |
-|-------|-------------|-------------------|
-| `confirmation` | Application received/confirmed | Applied |
-| `rejection` | Application rejected | Rejected |
-| `interview` | Interview invitation | Interviewing |
-| `offer` | Job offer | Offer |
-| `not_job` | Not job-related | - |
-## Performance
-- **Validation Accuracy:** 99.65%
-- **Training Data:** 28,500 synthetic + curated emails
-## Usage with ONNX Runtime Web
 ```javascript
-import * as ort from 'onnxruntime-web';
-// Load model
-const session = await ort.InferenceSession.create(
-  'https://huggingface.co/YOUR_USERNAME/email-classifier/resolve/main/model.onnx'
 );
-// Tokenize and run inference
-const results = await session.run({
-  input_ids: inputIdsTensor,
-  attention_mask: attentionMaskTensor,
-});
 ```
 ## Files
-- `model.onnx` - The ONNX model file
-- `vocab.txt` - Vocabulary file for tokenization
-- `config.json` - Model configuration
 ## Privacy

 ---
+library_name: transformers.js
 tags:
   - email-classification
   - job-search
   - onnx
   - transformers.js
   - browser-ml
+  - bert
 license: mit
 datasets:
   - custom
 # Email Classifier for Job Applications
+A fine-tuned BGE-small model that classifies emails into job application categories. Designed to run entirely in the browser using Transformers.js.
 ## Model Description
+- **Base Model:** BAAI/bge-small-en-v1.5 (33M parameters)
 - **Task:** 5-class email classification
+- **Format:** ONNX (opset 14, IR version 7)
+- **Size:** 32.5 MB (quantized) / 127.6 MB (full)
 ## Labels
+| Label | Description |
+|-------|-------------|
+| `confirmation` | Application received/confirmed |
+| `rejection` | Application rejected |
+| `interview` | Interview invitation |
+| `offer` | Job offer |
+| `not_job` | Not job-related |
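
The previous revision of this table also carried an Application Status column. A minimal sketch of how calling code might keep that mapping on its own side, assuming the status strings from the removed column. `LABEL_TO_STATUS` and `statusForLabel` are hypothetical names and are not part of the model or of Transformers.js:

```javascript
// Hypothetical lookup table; the status strings follow the removed
// "Application Status" column of the earlier README revision.
const LABEL_TO_STATUS = {
  confirmation: 'Applied',
  rejection: 'Rejected',
  interview: 'Interviewing',
  offer: 'Offer',
  not_job: null, // not job-related: no status to record
};

// Returns the application status for a predicted label, or null when the
// email is not job-related. Throws on labels the model does not define.
function statusForLabel(label) {
  if (!Object.prototype.hasOwnProperty.call(LABEL_TO_STATUS, label)) {
    throw new Error(`Unknown label: ${label}`);
  }
  return LABEL_TO_STATUS[label];
}
```

Keeping the mapping outside the model card means the five labels stay stable even if an application tracks different status names.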
+## Training
+- **Method:** Curriculum learning (2-2-1 epochs)
+  - Phase 1: 2 epochs on full-body emails
+  - Phase 2: 2 epochs with 4:1 full-body:snippet mix
+  - Phase 3: 1 epoch with 1:1 balanced mix
+- **Training Data:** ~28K emails (original + augmented snippets)
+- **Validation Accuracy:** 100% (full-body), 100% (snippet)
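
The 2-2-1 schedule above can be written down as data. This is only an illustrative sketch of the phase layout, not the actual training script: `PHASES` and `buildEpochPlan` are hypothetical names, and the only values taken from the README are the epoch counts and the full-body:snippet ratios.

```javascript
// Hypothetical description of the 2-2-1 schedule. Each phase lists its epoch
// count and its full-body:snippet ratio (snippet: 0 means full-body only).
const PHASES = [
  { epochs: 2, fullBody: 1, snippet: 0 }, // Phase 1: full-body emails only
  { epochs: 2, fullBody: 4, snippet: 1 }, // Phase 2: 4:1 full-body:snippet
  { epochs: 1, fullBody: 1, snippet: 1 }, // Phase 3: 1:1 balanced mix
];

// Expands the phases into one entry per epoch, recording which phase the
// epoch belongs to and what fraction of its examples would be snippets.
function buildEpochPlan(phases) {
  const plan = [];
  phases.forEach((phase, i) => {
    const snippetFraction = phase.snippet / (phase.fullBody + phase.snippet);
    for (let e = 0; e < phase.epochs; e++) {
      plan.push({ epoch: plan.length + 1, phase: i + 1, snippetFraction });
    }
  });
  return plan;
}
```

`buildEpochPlan(PHASES)` yields five entries whose snippet fractions are 0, 0, 0.2, 0.2 and 0.5, which makes the gradual shift from full-body emails toward snippets explicit.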
+## Usage with Transformers.js
 ```javascript
+import { pipeline } from '@xenova/transformers';
+const classifier = await pipeline(
+  'text-classification',
+  'mattohan/job-tracker-email-classifier',
+  { quantized: true }
 );
+const result = await classifier('Thank you for applying to the Software Engineer position...');
+// [{ label: 'confirmation', score: 0.99 }]
 ```
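
A caller will usually want to act only on confident predictions. A minimal sketch, assuming the classifier returns an array shaped like the `[{ label, score }]` comment in the snippet above. `pickLabel` and the 0.8 default threshold are hypothetical choices, not part of Transformers.js or of this model:

```javascript
// Picks the highest-scoring entry from a result array shaped like
// [{ label, score }]. Returns null when the array is empty or when the best
// score is below minScore (the 0.8 default is an assumed value).
function pickLabel(results, minScore = 0.8) {
  if (!Array.isArray(results) || results.length === 0) return null;
  const best = results.reduce((a, b) => (b.score > a.score ? b : a));
  return best.score >= minScore ? best.label : null;
}
```

Returning `null` rather than a default label leaves it to the application to decide what to do with uncertain emails, for example leaving the application status unchanged.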
 ## Files
+```
+├── config.json
+├── tokenizer.json
+├── tokenizer_config.json
+├── vocab.txt
+├── special_tokens_map.json
+└── onnx/
+    ├── model.onnx           # Full model (127.6 MB)
+    └── model_quantized.onnx # Quantized model (32.5 MB)
+```
 ## Privacy