Commit 46b5225 (verified) by Adya662
1 Parent(s): 38b6598

Initial upload of BERT-Tiny AMD classifier

README.md ADDED
@@ -0,0 +1,141 @@
+ ---
+ license: mit
+ language:
+ - en
+ tags:
+ - text-classification
+ - answering-machine-detection
+ - bert-tiny
+ - binary-classification
+ - call-center
+ - voice-processing
+ pipeline_tag: text-classification
+ ---
+
+ # BERT-Tiny AMD Classifier
+
+ A lightweight BERT-Tiny model fine-tuned for Answering Machine Detection (AMD) in call center environments.
+
+ ## Model Description
+
+ This model is based on `prajjwal1/bert-tiny` and fine-tuned to classify phone call transcripts as either human or machine (answering machine/voicemail) responses. It is designed for real-time call center applications where answering machines must be detected quickly and accurately.
+
+ ## Model Architecture
+
+ - **Base Model**: `prajjwal1/bert-tiny` (2 layers, 128 hidden size, 2 attention heads)
+ - **Total Parameters**: ~4.4M (lightweight and efficient)
+ - **Input**: User transcript text (max 128 tokens)
+ - **Output**: Single logit; a sigmoid converts it to the probability of the machine class
+ - **Loss Function**: BCEWithLogitsLoss with positive weight for class imbalance
+
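The ~4.4M parameter figure can be cross-checked against the shapes in this commit's `config.json`. The sketch below assumes the standard BERT layout (embeddings, encoder layers, pooler, and a one-logit classification head); the function name `bert_param_count` is hypothetical and not part of any library.

```python
def bert_param_count(vocab_size, hidden, intermediate, layers,
                     max_positions, type_vocab, num_labels):
    """Count weights and biases of a standard BERT sequence classifier."""
    layer_norm = 2 * hidden  # scale + shift
    # Word, position and token-type embeddings plus their LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Up-projection, down-projection and LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden) + layer_norm)
    encoder = layers * (attention + feed_forward)
    pooler = hidden * hidden + hidden
    classifier = hidden * num_labels + num_labels
    return embeddings + encoder + pooler + classifier


# Values taken from config.json; num_labels=1 follows from the single id2label entry.
total = bert_param_count(vocab_size=30522, hidden=128, intermediate=512,
                         layers=2, max_positions=512, type_vocab=2,
                         num_labels=1)
print(f"{total:,} parameters")
print(f"{total * 4:,} bytes as float32")
```

Under these assumptions the float32 footprint lands within a few kilobytes of the 17,548,796-byte `model.safetensors` pointer in this commit; the remainder is presumably file header metadata. Note that most of the parameters sit in the word-embedding table, not in the two encoder layers.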
+ ## Performance
+
+ - **Validation Accuracy**: 97.75%
+ - **Precision** (machine class): 95.79%
+ - **Recall** (machine class): 95.79%
+ - **F1-Score** (machine class): 95.79%
+ - **Agreement with Rule-based System**: 97.75%
+
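The reported metrics can be rederived from the confusion matrix stored in `training_metadata.json` in this commit. The sketch below assumes rows are true classes and columns are predicted classes, ordered `[human, machine]`; the metadata file does not state the layout, although the result is the same either way here because the two off-diagonal counts are equal.

```python
# Confusion matrix copied from training_metadata.json.
# Assumed layout: rows = true class, columns = predicted class, order [human, machine].
confusion = [[512, 8], [8, 182]]
(tn, fp), (fn, tp) = confusion

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"samples:   {total}")
print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"f1:        {f1:.4f}")
```

Precision, recall and F1 coincide because the model makes the same number of false positives and false negatives (8 each) on the 710 validation samples.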
+ ## Training Data
+
+ - **Total Samples**: 3,548 phone call transcripts
+ - **Training Set**: 2,838 samples
+ - **Validation Set**: 710 samples
+ - **Class Distribution**: 26.8% machine calls, 73.2% human calls
+ - **Source**: ElevateNow call center data
+
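The class imbalance above is what the positive weight in the loss compensates for. The sketch below is a pure-Python rendering of the per-sample formula that `torch.nn.BCEWithLogitsLoss` documents for its `pos_weight` argument; it is not the training script, which is not part of this commit. The class counts 761 and 2,077 are an assumption: they are inferred from the 2,838 training samples and the stored `pos_weight`, not stated anywhere in the commit.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logit, target, pos_weight):
    """Binary cross-entropy on a raw logit, with positives scaled by pos_weight.

    Uses log(sigmoid(x)) = -softplus(-x) and log(1 - sigmoid(x)) = -softplus(x).
    """
    return pos_weight * target * softplus(-logit) + (1 - target) * softplus(logit)


# Assumed (inferred) training-split counts: machine = positive, human = negative.
n_machine, n_human = 761, 2077
pos_weight = n_human / n_machine

# An undecided model (logit 0) is penalised more for a machine sample.
loss_machine = weighted_bce_with_logits(0.0, 1, pos_weight)
loss_human = weighted_bce_with_logits(0.0, 0, pos_weight)
print(f"pos_weight = {pos_weight:.3f}")
print(f"loss on machine sample = {loss_machine:.4f}")
print(f"loss on human sample   = {loss_human:.4f}")
```

With this weighting, each machine transcript contributes roughly as much to the loss as 2.7 human transcripts, so the minority class is not drowned out during training.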
+ ## Usage
+
+ ### Basic Inference
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ # Load model and tokenizer
+ model = AutoModelForSequenceClassification.from_pretrained("your-username/bert-tiny-amd")
+ tokenizer = AutoTokenizer.from_pretrained("your-username/bert-tiny-amd")
+
+ # Prepare input
+ text = "Hello, this is John speaking"
+ inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True, padding=True)
+
+ # Make prediction: the model emits one logit; sigmoid turns it into P(machine)
+ with torch.no_grad():
+     outputs = model(**inputs)
+     logits = outputs.logits.squeeze(-1)
+     probability = torch.sigmoid(logits).item()
+     is_machine = probability >= 0.5
+
+ print(f"Prediction: {'Machine' if is_machine else 'Human'}")
+ print(f"Machine probability: {probability:.4f}")
+ ```
+
+ ### Production Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ class AMDClassifier:
+     def __init__(self, model_name="your-username/bert-tiny-amd"):
+         self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
+         self.tokenizer = AutoTokenizer.from_pretrained(model_name)
+         self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+         self.model.to(self.device)
+         self.model.eval()
+
+     def predict(self, transcript_text, threshold=0.5):
+         """Return (is_machine, machine_probability) for a single transcript."""
+         inputs = self.tokenizer(
+             transcript_text,
+             return_tensors="pt",
+             max_length=128,
+             truncation=True,
+             padding=True
+         ).to(self.device)
+
+         with torch.no_grad():
+             outputs = self.model(**inputs)
+             logits = outputs.logits.squeeze(-1)
+             probability = torch.sigmoid(logits).item()
+             is_machine = probability >= threshold
+
+         return is_machine, probability
+
+ # Usage
+ classifier = AMDClassifier()
+ is_machine, machine_probability = classifier.predict("Hello, this is John speaking")
+ ```
+
+ ## Training Details
+
+ - **Optimizer**: AdamW with weight decay (0.01)
+ - **Learning Rate**: 3e-5 with linear scheduling
+ - **Batch Size**: 32
+ - **Epochs**: 12 completed out of a 15-epoch maximum (stopped early)
+ - **Early Stopping**: Patience of 3 epochs
+ - **Class Imbalance**: Handled with a positive-class weight of 2.729 in the loss
+
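The early-stopping behaviour can be replayed against the `val_losses` recorded in `training_metadata.json`. This is a sketch under two assumptions that the commit does not confirm: that the monitored quantity is validation loss, and that only a strict decrease counts as an improvement. The helper name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to beat
    the best validation loss seen so far.
    """
    best = float("inf")
    best_epoch = 0
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Validation losses copied from training_metadata.json.
val_losses = [
    0.8112381230229917, 0.4864982029666071, 0.34563232180864917,
    0.26932784072730853, 0.24466017180162927, 0.2034845212879388,
    0.1938699973018273, 0.19390630900211955, 0.1721272283922071,
    0.17268858526064002, 0.17224800457125125, 0.18237287065257196,
]

stopped, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"best epoch: {best_epoch}, stopped after epoch: {stopped}")
```

Under these assumptions the replay matches the recorded history: the lowest validation loss occurs at epoch 9, and three non-improving epochs follow before training ends at epoch 12.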
+ ## Limitations
+
+ - Trained on English phone call transcripts
+ - May not generalize well to other languages or domains
+ - Performance may vary with different transcription quality
+ - Designed for short utterances (max 128 tokens)
+
+ ## Citation
+
+ ```bibtex
+ @misc{bert-tiny-amd,
+   title={BERT-Tiny AMD Classifier for Answering Machine Detection},
+   author={Your Name},
+   year={2025},
+   publisher={Hugging Face},
+   howpublished={\url{https://huggingface.co/your-username/bert-tiny-amd}}
+ }
+ ```
+
+ ## License
+
+ MIT License - see LICENSE file for details.
config.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 128,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 512,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 2,
+   "num_hidden_layers": 2,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.54.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:079d2a82939195cf53e63521c9efc0cb4133e012d790625e511544f649069651
+ size 17548796
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 1000000000000000019884624838656,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
training_metadata.json ADDED
@@ -0,0 +1,76 @@
+ {
+   "training_config": {
+     "model_name": "prajjwal1/bert-tiny",
+     "max_length": 128,
+     "batch_size": 32,
+     "learning_rate": 3e-05,
+     "num_epochs": 15,
+     "patience": 3,
+     "test_size": 0.2,
+     "device": "cpu",
+     "csv_file": "all_EN_calls.csv",
+     "s3_bucket": "voicex-call-recordings"
+   },
+   "final_metrics": {
+     "accuracy": 0.9774647887323944,
+     "precision": 0.9578947368421052,
+     "recall": 0.9578947368421052,
+     "f1": 0.9578947368421052,
+     "confusion_matrix": [
+       [
+         512,
+         8
+       ],
+       [
+         8,
+         182
+       ]
+     ]
+   },
+   "pos_weight": 2.729303547963206,
+   "threshold": 0.5,
+   "training_history": {
+     "train_losses": [
+       0.9435733710781912,
+       0.6628189873829317,
+       0.40206739406907155,
+       0.28053958831208475,
+       0.21479346770583913,
+       0.180794070108553,
+       0.14911148521337617,
+       0.13325696530636777,
+       0.12835281459468134,
+       0.11012767288792,
+       0.10539512767383222,
+       0.09656323011169272
+     ],
+     "val_losses": [
+       0.8112381230229917,
+       0.4864982029666071,
+       0.34563232180864917,
+       0.26932784072730853,
+       0.24466017180162927,
+       0.2034845212879388,
+       0.1938699973018273,
+       0.19390630900211955,
+       0.1721272283922071,
+       0.17268858526064002,
+       0.17224800457125125,
+       0.18237287065257196
+     ],
+     "val_accuracies": [
+       0.9042253521126761,
+       0.9690140845070423,
+       0.9704225352112676,
+       0.971830985915493,
+       0.9690140845070423,
+       0.976056338028169,
+       0.9746478873239437,
+       0.9746478873239437,
+       0.9774647887323944,
+       0.9732394366197183,
+       0.9746478873239437,
+       0.9774647887323944
+     ]
+   }
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff