skgezhil2005 committed on
Commit
53d9152
·
1 Parent(s): de032fb

updated README.md

Files changed (2)
  1. .gitattributes +2 -0
  2. README.md +45 -0
.gitattributes CHANGED
@@ -7,6 +7,7 @@
7
  *.gz filter=lfs diff=lfs merge=lfs -text
8
  *.h5 filter=lfs diff=lfs merge=lfs -text
9
  *.joblib filter=lfs diff=lfs merge=lfs -text
 
10
  *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
  *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
  *.model filter=lfs diff=lfs merge=lfs -text
@@ -19,6 +20,7 @@
19
  *.pb filter=lfs diff=lfs merge=lfs -text
20
  *.pickle filter=lfs diff=lfs merge=lfs -text
21
  *.pkl filter=lfs diff=lfs merge=lfs -text
 
22
  *.pt filter=lfs diff=lfs merge=lfs -text
23
  *.pth filter=lfs diff=lfs merge=lfs -text
24
  *.rar filter=lfs diff=lfs merge=lfs -text
 
7
  *.gz filter=lfs diff=lfs merge=lfs -text
8
  *.h5 filter=lfs diff=lfs merge=lfs -text
9
  *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.keras filter=lfs diff=lfs merge=lfs -text
11
  *.lfs.* filter=lfs diff=lfs merge=lfs -text
12
  *.mlmodel filter=lfs diff=lfs merge=lfs -text
13
  *.model filter=lfs diff=lfs merge=lfs -text
 
20
  *.pb filter=lfs diff=lfs merge=lfs -text
21
  *.pickle filter=lfs diff=lfs merge=lfs -text
22
  *.pkl filter=lfs diff=lfs merge=lfs -text
23
+ *.png filter=lfs diff=lfs merge=lfs -text
24
  *.pt filter=lfs diff=lfs merge=lfs -text
25
  *.pth filter=lfs diff=lfs merge=lfs -text
26
  *.rar filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,48 @@
1
  ---
2
  license: apache-2.0
3
  ---
1
  ---
2
  license: apache-2.0
3
  ---
4
+ # Email Classifier
5
+
6
+ This project implements an email classification model that assigns each email to a category. It uses SBERT (`all-MiniLM-L6-v2`) to produce text embeddings and a sequential neural network for the final classification.
7
+ ## Model Description
8
+ - **Architecture:** SBERT (384‑d) → Dense(128, ReLU) → Dense(64, ReLU) → Softmax(5)
9
+ - **Frameworks:** TensorFlow 2.17, sentence-transformers
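
As a rough sanity check on the classifier head described above, the sketch below counts its trainable parameters. It assumes every dense layer has a bias term, that the SBERT encoder is frozen and therefore not counted, and that the output layer has one unit per label (five, following the label list in the training data section). The names `dense_params` and `head_params` are hypothetical helpers, not part of the repository.

```python
# Layer widths follow the architecture bullet; NUM_CLASSES is an assumption
# based on the five labels listed under "Training Data & Preprocessing".
EMBEDDING_DIM = 384
HIDDEN_UNITS = [128, 64]
NUM_CLASSES = 5


def dense_params(inputs: int, units: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return inputs * units + units


def head_params(input_dim: int, hidden_units: list, num_classes: int) -> int:
    """Total trainable parameters of the dense stack on top of the embeddings."""
    sizes = [input_dim, *hidden_units, num_classes]
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


total = head_params(EMBEDDING_DIM, HIDDEN_UNITS, NUM_CLASSES)
print(total)
```

Under these assumptions the head is small compared with the encoder, which is consistent with training only the dense layers on precomputed embeddings.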
10
+
11
+ ## Training Data & Preprocessing
12
+ - **Emails:** 4954 college emails, manually labeled into `[Academics, Clubs, Internships, Others, Talks]`
13
+ - **Split:** 80% train / 20% test
14
+ - **Embedding & Labeling:**
15
+ 1. Each email was embedded with `all‑MiniLM‑L6‑v2` (SBERT).
16
+ 2. We created a small “prototype” set of example sentences for each category.
17
+ 3. For every email, we computed cosine similarities between its SBERT embedding and each prototype embedding.
18
+ 4. The email was assigned to the category whose prototype had the **highest** cosine score (threshold ≥ 0.4).
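
The labeling steps above can be sketched in plain Python. This is a minimal illustration, not the author's code: it assumes embeddings are already available as lists of floats, that an email is compared against every prototype sentence and takes the label of the single best match, and that an email scoring below the threshold gets a fallback value (the README does not say what happens in that case). The toy three-dimensional vectors stand in for real 384-dimensional SBERT embeddings, and `assign_label` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_label(email_vec, prototypes, threshold=0.4, fallback=None):
    """Return (label, score) for the best-matching prototype.

    If the best score is below `threshold`, `fallback` is returned as the
    label. That fallback behaviour is an assumption of this sketch.
    """
    best_label, best_score = fallback, -1.0
    for label, vectors in prototypes.items():
        for vec in vectors:
            score = cosine(email_vec, vec)
            if score > best_score:
                best_label, best_score = label, score
    if best_score < threshold:
        return fallback, best_score
    return best_label, best_score


# Toy prototype embeddings (assumed values, for illustration only).
prototypes = {
    "Academics": [[1.0, 0.0, 0.0]],
    "Clubs": [[0.0, 1.0, 0.0]],
}

print(assign_label([0.9, 0.1, 0.0], prototypes))  # close to the Academics prototype
print(assign_label([0.0, 0.0, 1.0], prototypes))  # orthogonal to both prototypes
```

With real data, the vectors would come from encoding the emails and prototype sentences with the same SBERT model, so that the similarities are comparable.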
19
+
20
+ ## Evaluation
21
+
22
+ The model was tested on **991** college‑email samples. Below are the per‑class precision, recall, F1‑score and support:
23
+
24
+ | Class | Label | Support | Precision | Recall | F1‑Score |
25
+ |:-----:|-------------|--------:|----------:|-------:|---------:|
26
+ | 0 | Academics | 200 | 0.92 | 0.97 | 0.94 |
27
+ | 1 | Clubs | 236 | 0.94 | 0.96 | 0.95 |
28
+ | 2 | Internships | 143 | 0.95 | 0.98 | 0.97 |
29
+ | 3 | Others | 200 | 0.95 | 0.83 | 0.89 |
30
+ | 4 | Talks | 212 | 0.93 | 0.94 | 0.93 |
31
+
32
+ \
33
+ **Aggregate metrics**
34
+
35
+ | Metric | Accuracy | Precision | Recall | F1‑Score |
36
+ |:-------------|---------:|----------:|-------:|---------:|
37
+ | Overall | 0.94 | — | — | — |
38
+ | Macro avg | — | 0.94 | 0.94 | 0.94 |
39
+ | Weighted avg | — | 0.94 | 0.94 | 0.93 |
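
For readers who want to see how the macro and weighted averages relate to the per-class table, the sketch below recomputes them from the published rows. It assumes the usual definitions (macro is the unweighted mean over classes; weighted uses each class's support as its weight). Because the per-class values are already rounded to two decimals, recomputed aggregates can differ from the reported ones in the last digit.

```python
# (label, support, precision, recall, f1) copied from the per-class table.
rows = [
    ("Academics", 200, 0.92, 0.97, 0.94),
    ("Clubs", 236, 0.94, 0.96, 0.95),
    ("Internships", 143, 0.95, 0.98, 0.97),
    ("Others", 200, 0.95, 0.83, 0.89),
    ("Talks", 212, 0.93, 0.94, 0.93),
]


def macro_avg(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean over classes, weighted by per-class support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


supports = [r[1] for r in rows]
precisions = [r[2] for r in rows]
recalls = [r[3] for r in rows]
f1s = [r[4] for r in rows]

print("total support:", sum(supports))
print("macro precision:", round(macro_avg(precisions), 2))
print("macro recall:", round(macro_avg(recalls), 2))
print("macro F1:", round(macro_avg(f1s), 2))
print("weighted F1:", round(weighted_avg(f1s, supports), 2))
```

The total support also serves as a check that the five rows account for every test sample.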
40
+
41
+ ### Confusion Matrix
42
+
43
+ ![Confusion Matrix](cm.png)
44
+
45
+ ## Usage
46
+
47
+ ```bash
48
+ pip install tensorflow sentence-transformers